xiaotong
xtongji
AI & ML interests
None yet
Recent Activity
upvoted a paper 3 days ago
Multi-Task GRPO: Reliable LLM Reasoning Across Tasks
authored a paper 10 days ago
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving
authored a paper 10 days ago
Rethinking Large Language Model Distillation: A Constrained Markov Decision Process Perspective
Organizations
None yet