ICCV 2023 • Nitzan Bitton-Guetta, Yonatan Bitton, Jack Hessel, Ludwig Schmidt, Yuval Elovici, Gabriel Stanovsky, Roy Schwartz
We introduce WHOOPS!, a new dataset and benchmark for visual commonsense.
Ranked #1 on Image-to-Text Retrieval on WHOOPS! A Vision-and-Language Benchmark of Synthetic and Compositional Images (using extra training data)