LLM - a meigel Collection

LLM

updated Jul 28, 2025

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17, 2025 • 115
ProcessBench: Identifying Process Errors in Mathematical Reasoning

Paper • 2412.06559 • Published Dec 9, 2024 • 86
AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

Paper • 2412.15084 • Published Dec 19, 2024 • 13
The Lessons of Developing Process Reward Models in Mathematical Reasoning

Paper • 2501.07301 • Published Jan 13, 2025 • 100
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8, 2025 • 288
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Paper • 2501.03226 • Published Jan 6, 2025 • 43
System-2 Mathematical Reasoning via Enriched Instruction Tuning

Paper • 2412.16964 • Published Dec 22, 2024 • 2
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

Paper • 2501.04686 • Published Jan 8, 2025 • 53
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 140
Reasoning Language Models: A Blueprint

Paper • 2501.11223 • Published Jan 20, 2025 • 33
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement

Paper • 2501.12273 • Published Jan 21, 2025 • 14
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

Paper • 2501.09686 • Published Jan 16, 2025 • 41
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14, 2025 • 300
Tensor Product Attention Is All You Need

Paper • 2501.06425 • Published Jan 11, 2025 • 90
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback

Paper • 2501.10799 • Published Jan 18, 2025 • 15
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published Feb 11, 2025 • 50
Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving

Paper • 2502.07640 • Published Feb 11, 2025 • 9
The Stochastic Parrot on LLM's Shoulder: A Summative Assessment of Physical Concept Understanding

Paper • 2502.08946 • Published Feb 13, 2025 • 191
Logical Reasoning in Large Language Models: A Survey

Paper • 2502.09100 • Published Feb 13, 2025 • 24
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25, 2025 • 75
BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving

Paper • 2502.03438 • Published Feb 5, 2025 • 2
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6, 2025 • 113
Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24, 2025 • 317