MercedeSnape's Collections

agent training

updated 4 days ago

  • Don't Just Fine-tune the Agent, Tune the Environment

    Paper • 2510.10197 • Published Oct 11, 2025 • 28

    Note: Approaches post-training from the perspective of problem instances rather than SFT / RL methods.
