Evolving Deeper LLM Thinking
Paper
• 2501.09891
• Published
• 115
ProcessBench: Identifying Process Errors in Mathematical Reasoning
Paper
• 2412.06559
• Published
• 86
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling
Paper
• 2412.15084
• Published
• 13
The Lessons of Developing Process Reward Models in Mathematical Reasoning
Paper
• 2501.07301
• Published
• 100
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper
• 2501.04519
• Published
• 288
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Paper
• 2501.03226
• Published
• 43
System-2 Mathematical Reasoning via Enriched Instruction Tuning
Paper
• 2412.16964
• Published
• 2
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
Paper
• 2501.04686
• Published
• 53
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Paper
• 2402.03300
• Published
• 140
Reasoning Language Models: A Blueprint
Paper
• 2501.11223
• Published
• 33
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement
Paper
• 2501.12273
• Published
• 14
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
Paper
• 2501.09686
• Published
• 41
Offline Reinforcement Learning for LLM Multi-Step Reasoning
Paper
• 2412.16145
• Published
• 38
MiniMax-01: Scaling Foundation Models with Lightning Attention
Paper
• 2501.08313
• Published
• 300
Tensor Product Attention Is All You Need
Paper
• 2501.06425
• Published
• 90
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
Paper
• 2501.10799
• Published
• 15
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction
Paper
• 2502.07316
• Published
• 50
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving
Paper
• 2502.07640
• Published
• 9
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding
Paper
• 2502.08946
• Published
• 191
Logical Reasoning in Large Language Models: A Survey
Paper
• 2502.09100
• Published
• 24
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
Paper
• 2502.18449
• Published
• 75
BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving
Paper
• 2502.03438
• Published
• 2
START: Self-taught Reasoner with Tools
Paper
• 2503.04625
• Published
• 113
Group Sequence Policy Optimization
Paper
• 2507.18071
• Published
• 317