-
Let LLMs Break Free from Overthinking via Self-Braking Tuning
Paper • 2505.14604 • Published • 23 -
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
Paper • 2505.16944 • Published • 8 -
Training Step-Level Reasoning Verifiers with Formal Verification Tools
Paper • 2505.15960 • Published • 7 -
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Paper • 2505.15134 • Published • 6
Collections
Discover the best community collections!
Collections including paper arxiv:2505.19590
-
Learning to Reason without External Rewards
Paper • 2505.19590 • Published • 29 -
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Paper • 2502.18581 • Published -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 90 -
Fractured Chain-of-Thought Reasoning
Paper • 2505.12992 • Published • 23
-
Chronos: Learning the Language of Time Series
Paper • 2403.07815 • Published • 47 -
McGill-NLP/Llama-3-8B-Web
Text Generation • 8B • Updated • 60 • 214 -
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 276 -
Incentivizing LLMs to Self-Verify Their Answers
Paper • 2506.01369 • Published • 1
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 18 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 108 -
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Paper • 2412.17739 • Published • 41 -
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Paper • 2412.15443 • Published • 10
-
Nuclear Norm Regularization for Deep Learning
Paper • 2405.14544 • Published • 1 -
Token embeddings violate the manifold hypothesis
Paper • 2504.01002 • Published • 1 -
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Paper • 2403.10476 • Published • 1 -
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
Paper • 2504.00254 • Published • 1
-
Let LLMs Break Free from Overthinking via Self-Braking Tuning
Paper • 2505.14604 • Published • 23 -
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios
Paper • 2505.16944 • Published • 8 -
Training Step-Level Reasoning Verifiers with Formal Verification Tools
Paper • 2505.15960 • Published • 7 -
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Paper • 2505.15134 • Published • 6
-
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models
Paper • 2412.11605 • Published • 18 -
Byte Latent Transformer: Patches Scale Better Than Tokens
Paper • 2412.09871 • Published • 108 -
Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization
Paper • 2412.17739 • Published • 41 -
SKETCH: Structured Knowledge Enhanced Text Comprehension for Holistic Retrieval
Paper • 2412.15443 • Published • 10
-
Learning to Reason without External Rewards
Paper • 2505.19590 • Published • 29 -
Scalable Best-of-N Selection for Large Language Models via Self-Certainty
Paper • 2502.18581 • Published -
Training Large Language Models to Reason in a Continuous Latent Space
Paper • 2412.06769 • Published • 90 -
Fractured Chain-of-Thought Reasoning
Paper • 2505.12992 • Published • 23
-
Nuclear Norm Regularization for Deep Learning
Paper • 2405.14544 • Published • 1 -
Token embeddings violate the manifold hypothesis
Paper • 2504.01002 • Published • 1 -
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Paper • 2403.10476 • Published • 1 -
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
Paper • 2504.00254 • Published • 1
-
Chronos: Learning the Language of Time Series
Paper • 2403.07815 • Published • 47 -
McGill-NLP/Llama-3-8B-Web
Text Generation • 8B • Updated • 60 • 214 -
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
Paper • 2505.24726 • Published • 276 -
Incentivizing LLMs to Self-Verify Their Answers
Paper • 2506.01369 • Published • 1