OmniRad: A Radiological Foundation Model for Multi-Task Medical Image Analysis • Paper • 2602.04547 • Published 2 days ago • 1
Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition • Paper • 2602.04486 • Published 2 days ago • 6
Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning • Paper • 2602.04284 • Published 2 days ago • 12
Horizon-LM: A RAM-Centric Architecture for LLM Training • Paper • 2602.04816 • Published 2 days ago • 16
A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces • Paper • 2602.03442 • Published 3 days ago • 17
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning • Paper • 2602.04634 • Published 2 days ago • 82
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models • Paper • 2602.04804 • Published 2 days ago • 44
Enhancing Multi-Image Understanding through Delimiter Token Scaling • Paper • 2602.01984 • Published 4 days ago • 5
Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention • Paper • 2602.01801 • Published 4 days ago • 26
Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models • Paper • 2602.02185 • Published 4 days ago • 123
Unified Personalized Reward Model for Vision Generation • Paper • 2602.02380 • Published 4 days ago • 18
Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks • Paper • 2602.01630 • Published 4 days ago • 46
FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents • Paper • 2602.01566 • Published 4 days ago • 44
VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration • Paper • 2601.22674 • Published 7 days ago • 5
Rethinking LLM-as-a-Judge: Representation-as-a-Judge with Small Language Models via Semantic Capacity Asymmetry • Paper • 2601.22588 • Published 7 days ago • 5
Toward Cognitive Supersensing in Multimodal Large Language Model • Paper • 2602.01541 • Published 4 days ago • 16