Reasoning Visual Dialogs with Structural and Partial Observations

We propose a novel model for the task of Visual Dialog, which exhibits complex dialog structures. To obtain a reasonable answer based on the current question and the dialog history, the underlying semantic dependencies between dialog entities are essential. In this paper, we explicitly formalize this task as inference in a graphical model with partially observed nodes and unknown graph structures (relations in dialog). The given dialog entities are viewed as the observed nodes. The answer to a given question is represented by a node with a missing value. We first introduce an Expectation Maximization algorithm to infer both the underlying dialog structures and the missing node values (desired answers). Based on this, we then propose a differentiable graph neural network (GNN) solution that approximates this process. Experimental results on the VisDial and VisDial-Q datasets show that our model outperforms comparable methods. We also observe that our method can infer the underlying dialog structure, which aids dialog reasoning.
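To make the idea concrete, below is a minimal PyTorch sketch (not the authors' implementation) of the inference loop the abstract describes: dialog entities are graph nodes, the answer node is unobserved, and the model alternates between estimating soft edge weights from the current node states (an E-step-like structure inference) and propagating messages along those edges to refine the unobserved node (an M-step-like update). All class names, dimensions, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StructureInferenceGNN(nn.Module):
    """Sketch of EM-style structure inference with a GNN (illustrative only)."""

    def __init__(self, dim: int = 256, iters: int = 3):
        super().__init__()
        self.iters = iters
        self.edge_scorer = nn.Bilinear(dim, dim, 1)  # scores a soft edge between two nodes
        self.message = nn.Linear(dim, dim)           # transforms neighbor states into messages
        self.update = nn.GRUCell(dim, dim)           # updates each node from its aggregated message

    def forward(self, observed: torch.Tensor, answer_init: torch.Tensor) -> torch.Tensor:
        """
        observed:    (num_entities, dim) embeddings of question/history entities (observed nodes).
        answer_init: (1, dim) initial guess for the unobserved answer node.
        Returns the refined answer-node state of shape (1, dim).
        """
        nodes = torch.cat([observed, answer_init], dim=0)  # (N, dim), answer node last
        n, dim = nodes.shape
        for _ in range(self.iters):
            # E-step-like: infer a soft adjacency over all node pairs from current states.
            src = nodes.unsqueeze(1).expand(n, n, dim).reshape(-1, dim)
            dst = nodes.unsqueeze(0).expand(n, n, dim).reshape(-1, dim)
            logits = self.edge_scorer(src, dst).view(n, n)
            adj = F.softmax(logits, dim=-1)                 # each row is a distribution over neighbors

            # M-step-like: pass messages along inferred edges and update all nodes,
            # including the unobserved answer node.
            msgs = adj @ self.message(nodes)                # (N, dim)
            nodes = self.update(msgs, nodes)
        return nodes[-1:].clone()                           # refined answer-node representation


if __name__ == "__main__":
    model = StructureInferenceGNN(dim=256, iters=3)
    history = torch.randn(9, 256)   # e.g. caption + previous QA-round embeddings (assumed given)
    answer0 = torch.zeros(1, 256)   # unobserved answer node, initialized to zeros
    refined = model(history, answer0)
    print(refined.shape)            # torch.Size([1, 256])
```

In practice the refined answer node would be scored against candidate answer embeddings to produce the rankings reported below; that scoring head is omitted here for brevity.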

PDF Abstract (CVPR 2019)

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Visual Dialog | VisDial v0.9 val | GNN | MRR | 0.6285 | #14 |
| Visual Dialog | VisDial v0.9 val | GNN | R@1 | 48.95 | #10 |
| Visual Dialog | VisDial v0.9 val | GNN | R@5 | 79.65 | #11 |
| Visual Dialog | VisDial v0.9 val | GNN | R@10 | 88.36 | #11 |
| Visual Dialog | VisDial v0.9 val | GNN | Mean Rank | 4.57 | #11 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | NDCG (x 100) | 52.82 | #70 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | MRR (x 100) | 61.37 | #42 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | R@1 | 47.33 | #42 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | R@5 | 77.98 | #42 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | R@10 | 87.83 | #41 |
| Visual Dialog | Visual Dialog v1.0 test-std | GNN | Mean Rank | 4.57 | #39 |

Methods