Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles Paper • 2602.01590 • Published 8 days ago • 32
WildGraphBench: Benchmarking GraphRAG with Wild-Source Corpora Paper • 2602.02053 • Published 8 days ago • 40
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents Paper • 2602.01566 • Published 8 days ago • 45
QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management Paper • 2512.12967 • Published Dec 15, 2025 • 108