Add evaluation results for GPQA, HLE

#3
Opened by SaylorTwift (HF Staff)

Evaluation Results

This PR adds evaluation results extracted from the model card.

Benchmarks:

  • GPQA: 85.2
  • HLE: 19.4

Files created:

  • .eval_results/gpqa.yaml
  • .eval_results/hle.yaml
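
The file names above appear to follow a simple pattern: the benchmark name lowercased, with a `.yaml` extension, under `.eval_results/`. A minimal sketch of that mapping, assuming the pattern holds in general; `eval_result_path` is a hypothetical helper for illustration, not part of any Hugging Face tooling, and the contents of the YAML files are not shown in this PR, so they are not reproduced here:

```python
from pathlib import PurePosixPath

# Directory name taken from the "Files created" list in this PR.
EVAL_DIR = PurePosixPath(".eval_results")


def eval_result_path(benchmark: str) -> str:
    """Hypothetical helper: map a benchmark name to its results file path.

    Assumes the naming pattern seen in this PR (lowercased name + ".yaml").
    """
    return str(EVAL_DIR / f"{benchmark.lower()}.yaml")


# Scores as listed in the PR description.
scores = {"GPQA": 85.2, "HLE": 19.4}

# Derive the expected file path for each benchmark.
paths = {name: eval_result_path(name) for name in scores}

for name, path in paths.items():
    print(f"{name}: {scores[name]} -> {path}")
```

Running this against the two benchmarks listed should yield the same two paths that the PR reports as created.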
