no code implementations • 18 Apr 2024 • Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin
Multimodal Large Language Models (MLLMs) have demonstrated strong capabilities in understanding multimodal information, ranging from Image LLMs to the more complex Video LLMs.