Commit 4b0c31b (verified) · Parent(s): d452e6a
garg-aayush committed: Upload README.md with huggingface_hub
Files changed (1): README.md added (+24, -0)

# LLaMA-3.1-8B SFT (No Prompt Masking)

LLaMA-3.1-8B fine-tuned with SFT instruction tuning **without prompt masking**: the loss is computed on all tokens, including the prompt, rather than on response tokens only.
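
As a quick usage sketch, the fine-tuned weights can be loaded with the standard `transformers` API. The repo ID below is a hypothetical placeholder, not the actual model ID of this repository.

```python
# Usage sketch; replace the placeholder repo ID with this repository's actual ID.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "garg-aayush/llama-3.1-8b-sft-no-prompt-masking"  # hypothetical placeholder

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain what prompt masking means in instruction tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```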

## Training Details
- **Base Model**: meta-llama/Llama-3.1-8B
- **Dataset**: UltraChat-200K + SafetyLlama (~200K examples)
- **Training**: 1 epoch (6726 steps)
- **Prompt Masking**: Disabled (loss on all tokens); see the sketch below
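
To make the "no prompt masking" setting concrete, below is a minimal sketch (not the actual training code) of label construction for SFT. With masking disabled, the labels are simply a copy of the input IDs, so cross-entropy loss is taken over prompt and response tokens alike; the conventional variant sets prompt positions to -100 so the loss ignores them.

```python
# Minimal sketch of SFT label construction, not the actual training code.
import torch

def build_labels(input_ids: torch.Tensor, prompt_len: int, mask_prompt: bool) -> torch.Tensor:
    """Return labels for causal-LM SFT, optionally masking the prompt tokens."""
    labels = input_ids.clone()
    if mask_prompt:
        # Conventional SFT: -100 is ignored by PyTorch's cross-entropy loss.
        labels[:prompt_len] = -100
    return labels

# Toy example: 4 prompt tokens followed by 3 response tokens.
input_ids = torch.tensor([101, 2054, 2003, 102, 3437, 2182, 103])
prompt_len = 4

print(build_labels(input_ids, prompt_len, mask_prompt=False))  # this model: loss on all tokens
print(build_labels(input_ids, prompt_len, mask_prompt=True))   # conventional: prompt ignored
```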

## Evaluation Results

| Benchmark  | Baseline (Llama-3.1-8B) | This Model |
|------------|-------------------------|------------|
| GSM8K      | 16.4%                   | 29.0%      |
| MMLU       | 58.1%                   | 58.4%      |
| SST Safety | 62.0%                   | 78.0%      |
| AlpacaEval | 1.57%                   | **5.3%**   |

## Files
- `eval_baseline/`: Baseline evaluation results (pre-finetuning Llama-3.1-8B)

## Reference
Part of CS336 Assignment 5 (SFT Instruction Tuning). See [building-from-scratch/sft](https://github.com/garg-aayush/building-from-scratch/tree/main/sft) for details.