27 Apr 2023 • Jiahua Rao, Zifei Shan, Longpo Liu, Yao Zhou, Yuedong Yang
With the recent progress in large-scale vision and language representation learning, Vision Language Pre-training (VLP) models have achieved promising improvements on various multi-modal downstream tasks.