TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Visual Question Answering (VQA)	F-VQA	ZS-F-VQA	Accuracy	88.49	# 1
Visual Question Answering (VQA)	F-VQA	ZS-F-VQA	MRR	0.685	# 1
Visual Question Answering (VQA)	F-VQA	ZS-F-VQA	MR	9.17	# 1
Visual Question Answering (VQA)	F-VQA	ZS-F-VQA	Top-1 Accuracy	58.27	# 1
Visual Question Answering (VQA)	F-VQA	ZS-F-VQA	Top-3 Accuracy	76.51	# 1
Visual Question Answering (VQA)	ZS-F-VQA	SAN † - hard mask	Top-1 Accuracy	29.39	# 1
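
The table reports MR, MRR, and Top-k accuracy without defining them. A minimal sketch of the conventional definitions follows, assuming each question yields a 1-based rank for the ground-truth answer. The function name `ranking_metrics` and the toy ranks are illustrative assumptions, not taken from the paper or its evaluation code.

```python
from typing import Dict, Sequence


def ranking_metrics(ranks: Sequence[int], ks: Sequence[int] = (1, 3)) -> Dict[str, float]:
    """Compute MR, MRR and Top-k accuracy from 1-based ranks of the gold answers."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    n = len(ranks)
    metrics = {
        # Mean Rank: average position of the gold answer (lower is better).
        "MR": sum(ranks) / n,
        # Mean Reciprocal Rank: average of 1/rank, reported as a fraction.
        "MRR": sum(1.0 / r for r in ranks) / n,
    }
    for k in ks:
        # Top-k accuracy: percentage of questions whose gold answer is in the top k.
        metrics[f"Top-{k}"] = 100.0 * sum(r <= k for r in ranks) / n
    return metrics


# Toy ranks, made up for illustration only (not results from the paper).
example = ranking_metrics([1, 2, 4, 1])
print(example)
```

Under these definitions, MRR is a fraction in (0, 1] and Top-k accuracy is a percentage, which matches the scales of the values in the table.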


Zero-shot Visual Question Answering using Knowledge Graph

12 Jul 2021 · Zhuo Chen, Jiaoyan Chen, Yuxia Geng, Jeff Z. Pan, Zonggang Yuan, Huajun Chen

Incorporating external knowledge into Visual Question Answering (VQA) has become a vital practical need. Existing methods mostly adopt pipeline approaches with separate components for knowledge matching and extraction, feature learning, etc. However, such pipeline approaches suffer when any component performs poorly, which leads to error propagation and poor overall performance. Furthermore, most existing approaches ignore the answer bias issue: in real-world applications, many answers may never have appeared during training (i.e., unseen answers). To bridge these gaps, we propose a Zero-shot VQA algorithm that uses knowledge graphs and a mask-based learning mechanism to better incorporate external knowledge, and we present new answer-based Zero-shot VQA splits for the F-VQA dataset. Experiments show that our method achieves state-of-the-art performance in Zero-shot VQA with unseen answers, while also dramatically improving existing end-to-end models on the normal F-VQA task.
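
The abstract names a mask-based learning mechanism, and the results table lists a "hard mask" variant, but neither is specified on this page. As a rough sketch of one plausible reading, the code below restricts a model's answer scores to a candidate set before the softmax. The candidate set is assumed to come from a knowledge-graph lookup, and the names `apply_hard_mask`, `softmax`, and `candidate_ids` are hypothetical. This is not the authors' implementation.

```python
import math
from typing import Iterable, List


def apply_hard_mask(scores: List[float], candidate_ids: Iterable[int]) -> List[float]:
    """Keep scores of candidate answers; set every other score to -inf."""
    allowed = set(candidate_ids)
    if not allowed:
        raise ValueError("candidate set must be non-empty")
    return [s if i in allowed else -math.inf for i, s in enumerate(scores)]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax; entries at -inf receive probability 0."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scores over four answers (illustrative values only).
scores = [2.0, 0.5, 1.0, 3.0]
# Assumption: answers 0 and 2 are the candidates suggested by the knowledge graph.
masked = apply_hard_mask(scores, candidate_ids=[0, 2])
probs = softmax(masked)
predicted = max(range(len(probs)), key=probs.__getitem__)
print(predicted, probs)
```

In this toy case the highest raw score belongs to answer 3, but the mask removes it, so the prediction falls on the best-scoring candidate instead. A softer variant would subtract a finite penalty rather than assigning -inf.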

Code

China-UK-ZSL/ZS-F-VQA official

Fangyin1994/KCL

Tasks

Knowledge Graphs

Question Answering

Visual Question Answering

Visual Question Answering (VQA)

Datasets

Introduced in the Paper:

ZS-F-VQA

Used in the Paper:

Visual Question Answering

OK-VQA

KVQA

Results from the Paper

Ranked #1 on Visual Question Answering (VQA) on F-VQA
