Language-Conditioned Graph Networks for Relational Reasoning

Solving grounded language tasks often requires reasoning about relationships between objects in the context of a given task. For example, to answer the question "What color is the mug on the plate?" we must check the color of the specific mug that satisfies the "on" relationship with respect to the plate. Recent work has proposed various methods capable of complex relational reasoning. However, most of their power lies in the inference structure, while the scene is represented with simple local appearance features. In this paper, we take an alternative approach and build contextualized representations for objects in a visual scene to support relational reasoning. We propose a general framework of Language-Conditioned Graph Networks (LCGN), where each node represents an object and is described by a context-aware representation built from related objects through iterative message passing conditioned on the textual input. For example, conditioning on the "on" relationship to the plate, the object "mug" gathers messages from the object "plate" to update its representation to "mug on the plate", which can then be easily consumed by a simple classifier for answer prediction. We experimentally show that our LCGN approach effectively supports relational reasoning and improves performance across several tasks and datasets. Our code is available at http://ronghanghu.com/lcgn.
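To make the idea concrete, here is a minimal NumPy sketch of one language-conditioned message-passing round. It is not the paper's implementation: the dot-product attention, the element-wise text modulation, and all weight names (`W_msg`, `W_upd`) are illustrative assumptions standing in for the learned components of LCGN.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lcgn_round(local_feats, context, text_vec, W_msg, W_upd):
    """One message-passing round: every object (node) gathers messages
    from the others, with edge weights conditioned on the text vector.

    local_feats: (N, D) fixed local appearance features
    context:     (N, D) current context-aware representations
    text_vec:    (D,)   encoding of the textual input
    """
    # Text-conditioned pairwise relevance scores (a crude stand-in for
    # the paper's learned attention): modulate contexts by the text,
    # then compare all pairs.
    keys = context * text_vec            # (N, D)
    scores = context @ keys.T            # (N, N)
    np.fill_diagonal(scores, -1e9)       # no self-messages
    attn = softmax(scores, axis=1)       # edge weights per node
    messages = attn @ (context @ W_msg)  # gather from related objects
    # Update each node from its local features plus gathered messages.
    new_context = np.tanh(
        np.concatenate([local_feats, messages], axis=1) @ W_upd
    )
    return new_context

# Iterative refinement over several rounds, as in the framework's
# description; sizes and the number of rounds are arbitrary here.
rng = np.random.default_rng(0)
N, D = 4, 8
local = rng.standard_normal((N, D))
text = rng.standard_normal(D)
W_msg = rng.standard_normal((D, D)) / np.sqrt(D)
W_upd = rng.standard_normal((2 * D, D)) / np.sqrt(2 * D)

ctx = local.copy()
for _ in range(3):
    ctx = lcgn_round(local, ctx, text, W_msg, W_upd)
```

After the rounds, `ctx[i]` plays the role of the contextualized object representation (e.g., "mug on the plate") that a downstream classifier would consume.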

ICCV 2019
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Visual Question Answering (VQA) | CLEVR | single-hop + LCGN (ours) | Accuracy | 97.9 | #9 |
| Referring Expression Comprehension | CLEVR-Ref+ | GroundeR + LCGN (ours) | Accuracy | 74.8 | #3 |
| Visual Question Answering (VQA) | GQA test-dev | single-hop + LCGN (ours) | Accuracy | 55.8 | #6 |
| Visual Question Answering (VQA) | GQA test-std | single-hop + LCGN (ours) | Accuracy | 56.1 | #5 |
