LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls Paper • 2511.09148 • Published 25 days ago • 16
VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use Paper • 2509.01055 • Published Sep 1 • 75
Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning Paper • 2510.23473 • Published Oct 27 • 84
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning Paper • 2508.08221 • Published Aug 11 • 49
CodeApex: A Bilingual Programming Evaluation Benchmark for Large Language Models Paper • 2309.01940 • Published Sep 5, 2023 • 1