# LLaMA-3.1-8B SFT (No Prompt Masking)
LLaMA-3.1-8B fine-tuned with supervised fine-tuning (SFT) for instruction following, without prompt masking: the loss is computed on all tokens (prompt and response) rather than on response tokens only.
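A minimal loading and generation sketch with Hugging Face `transformers` is shown below; the dtype, prompt, and generation settings are illustrative assumptions, not values taken from the training or evaluation setup.

```python
# Minimal usage sketch (assumed settings; adjust dtype/device/prompt format as needed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "garg-aayush/llama31-8b-sft-nomask"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain prompt masking in supervised fine-tuning in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```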
## Training Details
- Base Model: meta-llama/Llama-3.1-8B
- Dataset: UltraChat-200K + SafetyLlama (~200K examples)
- Training: 1 epoch (6726 steps)
- Prompt Masking: Disabled (loss computed on all tokens; see the sketch after this list)
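To make the prompt-masking setting concrete, here is a hedged sketch of how label construction differs with and without masking; the helper names and `prompt_ids`/`response_ids` inputs are hypothetical and not taken from the actual training code.

```python
# Sketch of label construction with and without prompt masking (illustrative only).
import torch
import torch.nn.functional as F

IGNORE_INDEX = -100  # positions with this label are ignored by F.cross_entropy


def build_labels(prompt_ids, response_ids, mask_prompt: bool):
    """Concatenate prompt and response; optionally mask prompt positions out of the loss."""
    input_ids = torch.tensor(prompt_ids + response_ids)
    labels = input_ids.clone()
    if mask_prompt:
        labels[: len(prompt_ids)] = IGNORE_INDEX  # loss on response tokens only
    return input_ids, labels  # this model corresponds to mask_prompt=False


def causal_lm_loss(logits, labels):
    """Standard next-token cross-entropy with the usual one-position shift."""
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=IGNORE_INDEX,
    )
```

With masking disabled, the model is also trained to reproduce prompt tokens, so part of each example's gradient comes from the instruction rather than only from the response.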
## Evaluation Results
| Benchmark | Metric | Baseline | This Model |
|---|---|---|---|
| GSM8K | accuracy | 16.4% | 29.0% |
| MMLU | accuracy | 58.1% | 58.4% |
| Simple Safety Tests (SST) | safety score | 62.0% | 78.0% |
| AlpacaEval | LC win rate | 1.57% | 5.3% |

Baseline refers to the pre-finetuning meta-llama/Llama-3.1-8B checkpoint (see `eval_baseline/` below).
## Files
- `eval_baseline/`: baseline evaluation results for the pre-finetuning Llama-3.1-8B model
## Reference
Part of CS336 Assignment 5 (SFT Instruction Tuning). See `building-from-scratch/sft` for details.