RefCOCO

Introduced by Kazemzadeh et al. in ReferItGame: Referring to Objects in Photographs of Natural Scenes

The RefCOCO dataset is a referring expression generation (REG) dataset used for tasks related to understanding natural language expressions that refer to specific objects in images. Here are the key details about RefCOCO:

Collection Method: The dataset was collected using the ReferitGame, a two-player game. In this game, the first player views an image with a segmented target object and writes a natural language expression referring to that object. The second player sees only the image and the referring expression and must click on the corresponding object. If both players perform correctly, they earn points and switch roles; otherwise, they receive a new object and image for description.
Dataset Variants: RefCOCO: Contains 142,209 refer expressions for 50,000 objects across 19,994 images. RefCOCO+: Includes 141,564 expressions for 49,856 objects in 19,992 images. RefCOCOg: This variant has 25,799 images, 95,010 referring expressions, and 49,822 object instances.
Language and Restrictions: RefCOCO allows any type of language in the referring expressions. RefCOCO+ disallows location words in expressions to focus purely on appearance-based descriptions (e.g., "the man in the yellow polka-dotted shirt") rather than viewer-dependent descriptions (e.g., "the second man from the left").

These datasets serve as valuable resources for tasks like referring expression segmentation, comprehension, and visual grounding in computer vision research.

Homepage

Benchmarks

Add a new result Link an existing benchmark

Task	Dataset Variant	Best Model
Referring Expression Segmentation	RefCoCo val	HIPIE
Referring Expression Segmentation	RefCOCO testA	UNINEXT-H
Referring Expression Segmentation	RefCOCO+ val	HIPIE
Referring Expression Segmentation	RefCOCO+ test B	UniLSeg-100
Referring Expression Segmentation	RefCOCO testB	UNINEXT-H
Referring Expression Segmentation	RefCOCO+ testA	UniLSeg-100
Referring Expression Comprehension	RefCOCO	UNINEXT-H
Referring Expression Segmentation	RefCoCo val	UNINEXT-H
Referring Expression Segmentation	RefCOCOg-val	UniLSeg-100
Referring Expression Comprehension	RefCoco+	ONE-PEACE
Referring Expression Comprehension	RefCOCOg-val	ONE-PEACE
Referring Expression Comprehension	RefCOCOg-test	UNINEXT-H
Referring Expression Segmentation	RefCOCOg-test	UniLSeg-100
Visual Grounding	RefCOCO+ test B	mPLUG-2
Visual Grounding	RefCOCO+ val	mPLUG-2
Visual Grounding	RefCOCO+ testA	mPLUG-2
Referring Expression Segmentation	RefCOCO	GLEE-Pro
Referring Expression Segmentation	RefCOCO testA	EVP
Referring Expression Segmentation	RefCOCO testB	EVP