no code implementations • 26 Dec 2023 • Zhengzhuo Xu, Sinan Du, Yiyan Qi, Chengjin Xu, Chun Yuan, Jian Guo
Multimodal Large Language Models (MLLMs) demonstrate impressive image understanding and generation capabilities.
Visual Reasoning