Xinyu Fang

nebulae09

FangXinyu-0913

AI & ML interests

None yet

Recent Activity

authored a paper 2 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

authored a paper 2 days ago

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

authored a paper 2 days ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

upvoted 4 papers 2 days ago

IWR-Bench: Can LVLMs reconstruct interactive webpage from a user interaction video?

Paper • 2509.24709 • Published Sep 29 • 6

ATLAS: A High-Difficulty, Multidisciplinary Benchmark for Frontier Scientific Reasoning

Paper • 2511.14366 • Published 19 days ago • 15

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published 2 days ago • 37

Nex-N1: Agentic Models Trained via a Unified Ecosystem for Large-Scale Environment Construction

Paper • 2512.04987 • Published 2 days ago • 60

upvoted a paper 12 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Paper • 2511.19399 • Published 12 days ago • 54

upvoted 3 papers about 1 month ago

Spatial-SSRL: Enhancing Spatial Understanding via Self-Supervised Reinforcement Learning

Paper • 2510.27606 • Published Oct 31 • 27

Video-Thinker: Sparking "Thinking with Videos" via Reinforcement Learning

Paper • 2510.23473 • Published Oct 27 • 84

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27 • 96

upvoted 3 papers about 2 months ago

VideoCanvas: Unified Video Completion from Arbitrary Spatiotemporal Patches via In-Context Conditioning

Paper • 2510.08555 • Published Oct 9 • 63

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published Oct 9 • 109

upvoted a paper 2 months ago

MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use

Paper • 2509.24002 • Published Sep 28 • 173

upvoted 2 papers 3 months ago

CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics

Paper • 2508.18124 • Published Aug 25 • 48

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 208

upvoted a paper 4 months ago

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published Aug 5 • 37

upvoted 2 papers 5 months ago

The Imitation Game: Turing Machine Imitator is Length Generalizable Reasoner

Paper • 2507.13332 • Published Jul 17 • 48

LayerCake: Token-Aware Contrastive Decoding within Large Language Model Layers

Paper • 2507.04404 • Published Jul 6 • 21

upvoted a collection 5 months ago

OpenCompass Multi-Modal Leaderboards

6 items • Updated Jul 24 • 4

upvoted 2 papers 5 months ago

CompassJudger-2: Towards Generalist Judge Model via Verifiable Rewards

Paper • 2507.09104 • Published Jul 12 • 17

Rethinking Verification for LLM Code Generation: From Generation to Testing

Paper • 2507.06920 • Published Jul 9 • 28