Search Results for author: Rusiru Thushara

Found 1 paper, 1 paper with code

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

1 code implementation · 22 Nov 2023 · Shehan Munasinghe, Rusiru Thushara, Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Mubarak Shah, Fahad Khan

Extending image-based Large Multimodal Models (LMMs) to videos is challenging due to the inherent complexity of video data.

Benchmarking, Phrase Grounding, +4 more
