1 code implementation • CVPR 2023 • Zheren Fu, Zhendong Mao, Yan Song, Yongdong Zhang
Image-text matching, a bridge connecting image and language, is an important task that generally learns a holistic cross-modal embedding to achieve high-quality semantic alignment between the two modalities.
1 code implementation • 29 Nov 2022 • Zheren Fu, Zhendong Mao, Bo Hu, An-An Liu, Yongdong Zhang
They have overlooked the wide characteristic changes of different classes and cannot model abundant intra-class variations for generations.