Search Results for author: Rusiru Thushara

Found 2 papers, 2 papers with code

Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs

1 code implementation · 28 Jun 2024 · Sukmin Yun, Haokun Lin, Rusiru Thushara, Mohammad Qazim Bhat, Yongxin Wang, Zutao Jiang, Mingkai Deng, Jinhong Wang, Tianhua Tao, Junbo Li, Haonan Li, Preslav Nakov, Timothy Baldwin, Zhengzhong Liu, Eric P. Xing, Xiaodan Liang, Zhiqiang Shen

To address this problem, we propose Web2Code, a benchmark consisting of a new large-scale webpage-to-code dataset for instruction tuning and an evaluation framework for the webpage understanding and HTML code translation abilities of MLLMs.

Code Translation

PG-Video-LLaVA: Pixel Grounding Large Video-Language Models

1 code implementation · 22 Nov 2023 · Shehan Munasinghe, Rusiru Thushara, Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Mubarak Shah, Fahad Khan

Extending image-based Large Multimodal Models (LMMs) to videos is challenging due to the inherent complexity of video data.

Benchmarking, Phrase Grounding, +4
