MuirBench is a benchmark containing 11,264 images and 2,600 multiple-choice questions, providing robust evaluation on 12 multi-image understanding tasks.

  • MuirBench evaluates a comprehensive range of 12 multi-image understanding abilities, e.g., geographic understanding, diagram understanding, and visual retrieval, whereas prior benchmarks generally contain single-image questions.

  • MuirBench contains 10 diverse multi-image relations, e.g., narrative and complementary relations.

  • MuirBench provides a robust evaluation of models through unanswerable instance variants, which are created in three major ways.
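As a rough illustration of how a multiple-choice benchmark with unanswerable variants might be scored, the sketch below reports accuracy overall and separately for answerable and unanswerable questions. It is not MuirBench's official evaluation code: the function name `accuracy_by_group` and the record fields `prediction`, `answer`, and `answerable` are hypothetical names chosen for this example, and the toy records are made up.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return accuracy overall and per answerability group.

    Each record is a dict with hypothetical keys:
      prediction: the option the model chose (e.g. "A")
      answer:     the correct option
      answerable: True for a standard question, False for an unanswerable variant
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = "answerable" if record["answerable"] else "unanswerable"
        hit = record["prediction"] == record["answer"]
        for key in (group, "overall"):
            total[key] += 1
            correct[key] += int(hit)
    return {key: correct[key] / total[key] for key in total}


# Toy data for illustration only (not taken from the benchmark).
toy_records = [
    {"prediction": "A", "answer": "A", "answerable": True},
    {"prediction": "B", "answer": "C", "answerable": True},
    {"prediction": "D", "answer": "D", "answerable": False},
    {"prediction": "D", "answer": "D", "answerable": False},
]
scores = accuracy_by_group(toy_records)
print(scores)
```

Reporting the two groups separately makes it visible when a model scores well on standard questions but fails to recognize that a question cannot be answered.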

License


  • Unknown
