Search Results for author: Suyuan Huang

Found 1 paper, 0 papers with code

From Image to Video, what do we need in multimodal LLMs?

no code implementations • 18 Apr 2024 • Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin

Multimodal Large Language Models (MLLMs) have demonstrated profound capabilities in understanding multimodal information, ranging from Image LLMs to the more complex Video LLMs.

Video Understanding
