Shehan Munasinghe

shehan97

https://shehanmunasinghe.github.io/

AI & ML interests

Computer Vision, Multi-modal learning

authored a paper 3 months ago

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Paper • 2411.04923 • Published Nov 7, 2024

authored a paper about 2 years ago

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

Paper • 2311.13435 • Published Nov 22, 2023