LLaMA-3.1-8B SFT (No Prompt Masking)

LLaMA-3.1-8B fine-tuned with supervised fine-tuning (SFT) for instruction following, without prompt masking: the loss is computed on all tokens, including the prompt.
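
Assuming the checkpoint is published under garg-aayush/llama31-8b-sft-nomask, it should load with the standard transformers API. The snippet below is an illustrative sketch only; the prompt format and generation settings are assumptions, since this card does not document a chat template.

```python
# Sketch: load the fine-tuned checkpoint with the standard transformers API.
# The repo id is taken from this card; prompt formatting is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "garg-aayush/llama31-8b-sft-nomask"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain the difference between a list and a tuple in Python."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```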

Training Details

  • Base Model: meta-llama/Llama-3.1-8B
  • Dataset: UltraChat-200K + SafetyLlama (~200K examples)
  • Training: 1 epoch (6726 steps)
  • Prompt Masking: Disabled (loss computed on all tokens; see the sketch below)
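
To make the "no prompt masking" setting concrete, here is a minimal sketch (assumed, generic PyTorch / Hugging Face style, not the actual training script) of how the labels are built: without masking, the labels are simply a copy of the input ids, so the cross-entropy loss covers prompt tokens as well as response tokens.

```python
# Sketch (assumed, generic Hugging Face style): labels with and without
# prompt masking for a single prompt+response example.
import torch
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

prompt = "Question: What is 2 + 3?\nAnswer:"
response = " 5"

prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
response_ids = tokenizer(response + tokenizer.eos_token, add_special_tokens=False)["input_ids"]

input_ids = torch.tensor([prompt_ids + response_ids])

# No prompt masking (this model): loss is computed on every token.
labels_nomask = input_ids.clone()

# With prompt masking (for comparison): prompt positions are set to -100,
# which the Hugging Face / PyTorch cross-entropy loss ignores, so only
# response tokens would contribute to the gradient.
labels_masked = input_ids.clone()
labels_masked[:, : len(prompt_ids)] = -100
```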

Evaluation Results

Benchmark     Baseline   This Model
GSM8K         16.4%      29.0%
MMLU          58.1%      58.4%
SST Safety    62.0%      78.0%
AlpacaEval    1.57%      5.3%

Files

  • eval_baseline/: Baseline evaluation results (pre-finetuning Llama-3.1-8B)

Reference

Part of CS336 Assignment 5 (SFT Instruction Tuning). See building-from-scratch/sft for details.
