no code implementations • 27 Apr 2024 • Masoud Monajatipoor, Zi-Yi Dou, Aichi Chien, Nanyun Peng, Kai-Wei Chang
Vision-language models have become increasingly powerful for tasks that require an understanding of both visual and linguistic elements, bridging the gap between these modalities.
1 code implementation • 10 Aug 2021 • Masoud Monajatipoor, Mozhdeh Rouhsedaghat, Liunian Harold Li, Aichi Chien, C.-C. Jay Kuo, Fabien Scalzo, Kai-Wei Chang
Vision-and-language (V&L) models take images and text as input and learn to capture the associations between them.