ViSAudio: End-to-End Video-Driven Binaural Spatial Audio Generation Paper • 2512.03036 • Published 7 days ago • 20
SPARK: Synergistic Policy And Reward Co-Evolving Framework Paper • 2509.22624 • Published Sep 26 • 17
CapRL: Stimulating Dense Image Caption Capabilities via Reinforcement Learning Paper • 2509.22647 • Published Sep 26 • 32
ObjFiller-3D: Consistent Multi-view 3D Inpainting via Video Diffusion Models Paper • 2508.18271 • Published Aug 25 • 8
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7 • 29
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7 • 29
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning Paper • 2508.05405 • Published Aug 7 • 64
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7 • 29 • 3
Hi3DEval: Advancing 3D Generation Evaluation with Hierarchical Validity Paper • 2508.05609 • Published Aug 7 • 29 • 3
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience Paper • 2508.04700 • Published Aug 6 • 52
ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing Paper • 2506.19848 • Published Jun 24 • 26
GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography Paper • 2504.07083 • Published Apr 9 • 22