CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations
Providing explanations in the context of Visual Question Answering (VQA) presents a fundamental problem in machine learning. To obtain detailed insights into the process of generating natural language explanations for VQA, we introduce the large-scale CLEVR-X dataset that extends the CLEVR dataset with natural language explanations. For each image-question pair in the CLEVR dataset, CLEVR-X contains multiple structured textual explanations which are derived from the original scene graphs. By construction, the CLEVR-X explanations are correct and describe the reasoning and visual information that is necessary to answer a given question. We conducted a user study to confirm that the ground-truth explanations in our proposed dataset are indeed complete and relevant. We present baseline results for generating natural language explanations in the context of VQA using two state-of-the-art frameworks on the CLEVR-X dataset. Furthermore, we provide a detailed analysis of the explanation generation quality for different question and answer types. Additionally, we study the influence of using different numbers of ground-truth explanations on the convergence of natural language generation (NLG) metrics. The CLEVR-X dataset is publicly available at \url{https://explainableml.github.io/CLEVR-X/}.
Datasets

Introduced in the paper: CLEVR-X
Used in the paper: Visual Question Answering, CLEVR, Visual Question Answering v2.0, SNLI-VE, VQA-E, e-SNLI-VE

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Explanation Generation | CLEVR-X | PJ-X | B4 (BLEU-4) | 87.4 | # 1 |
| Explanation Generation | CLEVR-X | PJ-X | M (METEOR) | 58.9 | # 1 |
| Explanation Generation | CLEVR-X | PJ-X | RL (ROUGE-L) | 93.4 | # 1 |
| Explanation Generation | CLEVR-X | PJ-X | C (CIDEr) | 639.8 | # 1 |
| Explanation Generation | CLEVR-X | PJ-X | Acc (Accuracy) | 63.0 | # 2 |
| Explanation Generation | CLEVR-X | FM | B4 (BLEU-4) | 78.8 | # 2 |
| Explanation Generation | CLEVR-X | FM | M (METEOR) | 52.5 | # 2 |
| Explanation Generation | CLEVR-X | FM | RL (ROUGE-L) | 85.8 | # 2 |
| Explanation Generation | CLEVR-X | FM | C (CIDEr) | 566.8 | # 2 |
| Explanation Generation | CLEVR-X | FM | Acc (Accuracy) | 80.3 | # 1 |
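The B4, M, RL, and C scores above are standard natural language generation metrics (BLEU-4, METEOR, ROUGE-L, CIDEr), evaluated against the multiple ground-truth explanations that CLEVR-X provides per image-question pair. As an illustration of how multi-reference evaluation works, here is a minimal pure-Python sketch of multi-reference BLEU: n-gram counts are clipped against the maximum count of each n-gram over all references, so a hypothesis only needs to match well against one of them. This is a simplified sketch, not the paper's evaluation code; the `bleu` helper and the example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, references, max_n=4):
    """Multi-reference BLEU (simplified sentence-level sketch).

    Each hypothesis n-gram count is clipped by the maximum count of that
    n-gram over all references; the brevity penalty uses the reference
    length closest to the hypothesis length.
    """
    hyp = hypothesis.split()
    refs = [r.split() for r in references]
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_ngrams = ngrams(hyp, n)
        if not hyp_ngrams:
            return 0.0  # hypothesis too short for this n-gram order
        # clip each hypothesis n-gram by its max count in any reference
        max_ref = Counter()
        for r in refs:
            for g, c in ngrams(r, n).items():
                max_ref[g] = max(max_ref[g], c)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_ngrams.items())
        if clipped == 0:
            return 0.0
        log_prec += math.log(clipped / sum(hyp_ngrams.values()))
    # brevity penalty against the closest-length reference
    ref_len = min((abs(len(r) - len(hyp)), len(r)) for r in refs)[1]
    bp = 1.0 if len(hyp) >= ref_len else math.exp(1 - ref_len / len(hyp))
    return bp * math.exp(log_prec / max_n)


# An exact match with any single reference scores 1.0:
# bleu("the large gray metal cube", ["the large gray metal cube"]) -> 1.0
```

Because the hypothesis is scored against the best-matching parts of all references at once, adding further ground-truth explanations typically gives partial matches more credit, which is one reason the number of references influences how NLG metrics behave.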