LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published 12 days ago • 52
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 17 days ago • 19
RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing Paper • 2512.16864 • Published 17 days ago • 10
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published Nov 26, 2025 • 111
One4D: Unified 4D Generation and Reconstruction via Decoupled LoRA Control Paper • 2511.18922 • Published Nov 24, 2025 • 11
GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal Paper • 2404.13679 • Published Apr 21, 2024 • 1
F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Aggregative Gaussian Splatting Paper • 2501.06714 • Published Jan 12, 2025 • 2
HyRF: Hybrid Radiance Fields for Memory-efficient and High-quality Novel View Synthesis Paper • 2509.17083 • Published Sep 21, 2025 • 7
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published Apr 17, 2025 • 93
I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models Paper • 2502.10458 • Published Feb 12, 2025 • 38