arxiv:2411.04923
Shehan Munasinghe
shehan97
AI & ML interests
Computer Vision, Multi-modal learning
Recent Activity
authored a paper 18 days ago
VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos
upvoted a paper 6 months ago
Sekai: A Video Dataset towards World Exploration
upvoted a paper 6 months ago
CASS: Nvidia to AMD Transpilation with Data, Models, and Benchmark