Search Results for author: Rusiru Thushara

Found 1 paper, 1 paper with code

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

1 code implementation • 22 Nov 2023 • Shehan Munasinghe, Rusiru Thushara, Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Mubarak Shah, Fahad Khan

Extending image-based Large Multimodal Models (LMMs) to videos is challenging due to the inherent complexity of video data.

Benchmarking, Phrase Grounding, +4 more
