Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 4 days ago • 174
CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding Paper • 2602.01785 • Published 13 days ago • 93
Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models Paper • 2601.22060 • Published 17 days ago • 152
SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation Paper • 2602.02402 • Published 13 days ago • 32
Green-VLA: Staged Vision-Language-Action Model for Generalist Robots Paper • 2602.00919 • Published 15 days ago • 279
3D-Aware Implicit Motion Control for View-Adaptive Human Video Generation Paper • 2602.03796 • Published 12 days ago • 57
DynamicVLA: A Vision-Language-Action Model for Dynamic Object Manipulation Paper • 2601.22153 • Published 17 days ago • 68
Qwen Image Edit Camera Control 🎬 • 1.98k • Fast 4-step inference with Qwen Image Edit 2509
Z Image 🏃 • 122 • Generate high‑quality images from text prompts with Z‑Image
EvoCUA: Evolving Computer Use Agents via Learning from Scalable Synthetic Experience Paper • 2601.15876 • Published 24 days ago • 90