GRIT (General Robust Image Task Benchmark)

Introduced by Gupta et al. in GRIT: General Robust Image Task Benchmark

The General Robust Image Task (GRIT) Benchmark is an evaluation-only benchmark that measures the performance and robustness of vision systems across multiple image prediction tasks, concepts, and data sources. GRIT aims to encourage the research community to pursue the following research directions:

General purpose vision models - GRIT facilitates the evaluation of unified and general-purpose vision models that demonstrate a wide range of skills across a diverse set of concepts.
Robust specialized models - GRIT simplifies and unifies quantification of misinformation, calibration, and generalization under distribution shifts due to novel concepts, novel data sources or image distortions for 7 standard vision and vision-language tasks.
Efficient learning - GRIT includes a restricted and an unrestricted track. The restricted track constrains the allowed training data to a selected but rich set of data sources, which allows more scientific and meaningful comparison between models. This is meant to encourage resource-constrained researchers to participate in the GRIT challenge and to spur interest in efficient learning methods, as opposed to the dominant paradigm of training larger models on ever-increasing amounts of training data. The unrestricted track allows much more flexibility in training data selection, to test the capability of vision models trained with massive data and compute.

Benchmarks

Task	Dataset Variant	Best Model
Object Categorization	GRIT	Unified-IO XL
Object Localization	GRIT	Unified-IO XL
Referring Expression Comprehension	GRIT	Unified-IO XL
Surface Normal Estimation	GRIT	NLL-AngMF
Object Segmentation	GRIT	Unified-IO XL
Visual Question Answering (VQA)	GRIT	Unified-IO XL
Keypoint Estimation	GRIT	Mask R-CNN
Visual Question Answering	GRIT	OFA