IGLUE (Image-Grounded Language Understanding Evaluation)

Introduced by Bugliarello et al. in "IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages"

The Image-Grounded Language Understanding Evaluation (IGLUE) benchmark brings together four kinds of tasks across 20 diverse languages: visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment. It does so both by aggregating pre-existing datasets and by creating new ones. The benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting but also in newly defined few-shot learning setups.


