1 code implementation • 20 Sep 2023 • Runpei Dong, Chunrui Han, Yuang Peng, Zekun Qi, Zheng Ge, Jinrong Yang, Liang Zhao, Jianjian Sun, HongYu Zhou, Haoran Wei, Xiangwen Kong, Xiangyu Zhang, Kaisheng Ma, Li Yi
This paper presents DreamLLM, a learning framework that is the first to achieve versatile Multimodal Large Language Models (MLLMs) empowered with the frequently overlooked synergy between multimodal comprehension and creation.
Ranked #1 on Visual Question Answering on MMBench (GPT-3.5 score metric)
1 code implementation • 22 Dec 2022 • Yuxuan Cai, Yizhuang Zhou, Qi Han, Jianjian Sun, Xiangwen Kong, Jun Li, Xiangyu Zhang
This architectural scheme gives RevCol very different behavior from conventional networks: during forward propagation, features in RevCol are gradually disentangled as they pass through each column, and their total information is maintained rather than compressed or discarded as in other networks.
Ranked #8 on Semantic Segmentation on ADE20K (using extra training data)
no code implementations • CVPR 2023 • Xiangwen Kong, Xiangyu Zhang
Recently, Masked Image Modeling (MIM) has achieved great success in self-supervised visual recognition.
1 code implementation • 30 Jul 2022 • Junqiang Huang, Xiangwen Kong, Xiangyu Zhang
We focus on better understanding the critical factors of augmentation-invariant representation learning.