Article The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator • Dec 17, 2025 • 47
Qwen/Qwen3-Coder-30B-A3B-Instruct Text Generation • 31B • Updated Dec 3, 2025 • 682k • 947
Artificial Hippocampus Networks for Efficient Long-Context Modeling Paper • 2510.07318 • Published Oct 8, 2025 • 31
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution Paper • 2505.20286 • Published May 26, 2025 • 8 • 4
GAIA Leaderboard 🦾 • Running on CPU Upgrade • 585 • Submit your model answers to the GAIA benchmark and view the leaderboard