Referring Expression Comprehension
68 papers with code • 8 benchmarks • 8 datasets
Libraries
Use these libraries to find Referring Expression Comprehension models and implementations
Most implemented papers
Natural Language Object Retrieval
In this paper, we address the task of natural language object retrieval, to localize a target object within a given image based on a natural language query of the object.
MAttNet: Modular Attention Network for Referring Expression Comprehension
In this paper, we address referring expression comprehension: localizing an image region described by a natural language expression.
Explainable Neural Computation via Stack Neural Module Networks
In complex inferential tasks like question answering, machine learning models must confront two challenges: the need to implement a compositional reasoning process, and, in many applications, the need for this reasoning process to be interpretable to assist users in both development and prediction.
Language-Conditioned Graph Networks for Relational Reasoning
For example, conditioning on the "on" relationship to the plate, the object "mug" gathers messages from the object "plate" to update its representation to "mug on the plate", which can be easily consumed by a simple classifier for answer prediction.
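The message-passing step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embeddings, the sigmoid relation gate, and the transform matrix are all hypothetical stand-ins (in the actual model these would be produced by learned encoders over the image and the expression).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding size (illustrative)

# hypothetical object embeddings for "mug" and "plate"
mug, plate = rng.normal(size=d), rng.normal(size=d)

# language-conditioned gate for the relation "on" (assumed to be
# derived from an encoding of the expression; random here)
gate_on = 1.0 / (1.0 + np.exp(-rng.normal(size=d)))  # sigmoid gate in (0, 1)

W = rng.normal(size=(d, d)) * 0.1  # message transform (illustrative)

# "mug" gathers a gated message from "plate" and updates its
# representation, yielding a contextualized embedding akin to
# "mug on the plate"
message = gate_on * (W @ plate)
mug_updated = mug + message
```

A downstream classifier would then read `mug_updated` off the graph to predict the answer or the referred region.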
Talk2Car: Taking Control of Your Self-Driving Car
More specifically, we consider the problem in an autonomous driving setting, where a passenger requests an action that can be associated with an object found in a street scene.
A Real-time Global Inference Network for One-stage Referring Expression Comprehension
Referring Expression Comprehension (REC) is an emerging research area in computer vision, which refers to detecting the target region in an image given a text description.
Give Me Something to Eat: Referring Expression Comprehension with Commonsense Knowledge
In this case, we need to use commonsense knowledge to identify the objects in the image.
AttnGrounder: Talking to Cars with Attention
Visual grounding aims to localize a specific object in an image based on a given natural language text query.
Cosine meets Softmax: A tough-to-beat baseline for visual grounding
In this paper, we present a simple baseline for visual grounding for autonomous driving which outperforms the state of the art methods, while retaining minimal design choices.
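The title suggests the shape of such a baseline: score candidate regions against the text query with cosine similarity and normalize with a softmax. The sketch below is an assumption-laden illustration of that idea, not the paper's code; the text and region features are random stand-ins for what would come from learned encoders.

```python
import numpy as np

def cosine_softmax_grounding(text_emb, region_feats):
    """Score image regions against a text query: cosine similarity
    per region, then a softmax over regions (a sketch; feature
    extraction is assumed to happen upstream)."""
    t = text_emb / np.linalg.norm(text_emb)
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    sims = r @ t                     # cosine similarity per region
    exp = np.exp(sims - sims.max())  # numerically stable softmax
    probs = exp / exp.sum()
    return probs.argmax(), probs     # index of the best-matching region

rng = np.random.default_rng(0)
best, probs = cosine_softmax_grounding(
    rng.normal(size=16),          # hypothetical text embedding
    rng.normal(size=(5, 16)),     # hypothetical features for 5 regions
)
```

At inference, `best` would select the grounded region among the candidates.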
Language-Conditioned Feature Pyramids for Visual Selection Tasks
However, few models consider the fusion of linguistic features with multiple visual features with different sizes of receptive fields, though the proper size of the receptive field of visual features intuitively varies depending on expressions.