2 code implementations • NeurIPS 2023 • Wisdom Oluchi Ikezogwo, Mehmet Saygin Seyfioglu, Fatemeh Ghezloo, Dylan Stefan Chan Geva, Fatwir Sheikh Mohammed, Pavan Kumar Anand, Ranjay Krishna, Linda Shapiro
From YouTube, we curate QUILT: a large-scale vision-language dataset consisting of $802{,}144$ image and text pairs.