Scaling Test-Time Compute to Achieve Gold Medal at IOI 2025 with Open-Weight Models Oct 20 • 20
OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models Jul 18 • 50
Scoring Verifiers Collection Benchmarks for evaluating synthetic verifiers like test case generation and code reward models (as found in https://www.arxiv.org/abs/2502.13820). • 2 items • Updated 5 days ago • 2