1 code implementation • 14 Mar 2024 • YiFan Li, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Ji-Rong Wen
In this paper, we study the harmlessness alignment problem of multimodal large language models (MLLMs).
1 code implementation • 2 Nov 2023 • Yifan Du, Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Jinpeng Wang, Chuyuan Wang, Mingchen Cai, Ruihua Song, Ji-Rong Wen
By conducting a comprehensive empirical study, we find that instructions focused on complex visual reasoning tasks are particularly effective in improving the performance of MLLMs on evaluation benchmarks.
1 code implementation • 15 Dec 2022 • Hangyu Guo, Kun Zhou, Wayne Xin Zhao, Qinyu Zhang, Ji-Rong Wen
Although pre-trained language models (PLMs) have shown impressive performance with text-only self-supervised training, they are found to lack visual semantics and commonsense knowledge.