The Color Dataset (CoDa) is a probing dataset to evaluate the representation of visual properties in language models. CoDa consists of color distributions for 521 common objects, which are split into 3 groups: Single, Multi, and Any.

The default configuration of CoDa uses 10 CLIP-style templates (e.g. "A photo of a [object]"), and 10 cloze-style templates (e.g. "Everyone knows most [object] are [color]."

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


Modalities


Languages