arxiv:2504.19394
Toby Simonds
TamasSimonds
Models 7
TamasSimonds/llama3.1-8b-kp-1k-self-play-step-336-sys-prompt
8B • Updated
• 1
TamasSimonds/spiral-qwen2-5-3b-base-KP-1k-self-play-1-1-step-192
3B • Updated
TamasSimonds/spiral-qwen3-8b-base-KP-1k-self-play-1-1-step-192
8B • Updated
• 1
TamasSimonds/spiral-llama-3B-base-KP-1k-self-play-1-1-step-192
3B • Updated
TamasSimonds/Qwen3-4B-KP-no-sys-prompt-1k-self-play-1-1-step-192
4B • Updated
TamasSimonds/spiral-qwen3-4b-base-KP-1k-self-play-1.1_0707T15-09-49
4B • Updated
• 1
TamasSimonds/O1-Llama-3.2-3B
3B • Updated
• 13
Datasets 7
TamasSimonds/record-test4
• Updated
• 2.19k • 7
TamasSimonds/record-test3
Updated
• 8
TamasSimonds/olympiad-proof-problems
• Updated
• 39.8k • 40
TamasSimonds/poker_safety_realignment
• Updated
• 70 • 11
TamasSimonds/imo-dataset
• Updated
• 370 • 12
TamasSimonds/TextbooksToRLQuestions-100k
• Updated
• 108k • 10 • 5
TamasSimonds/ReasonSet
• Updated
• 1.78k • 12