IF-Bench: Benchmarking and Enhancing MLLMs for Infrared Images with Generative Visual Prompting Paper • 2512.09663 • Published 6 days ago • 3
Taming Modality Entanglement in Continual Audio-Visual Segmentation Paper • 2510.17234 • Published Oct 20 • 4
Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering Paper • 2510.14605 • Published Oct 16 • 4
R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning Paper • 2508.21113 • Published Aug 28 • 110