LaKo: Knowledge-driven Visual Question Answering via Late Knowledge-to-Text Injection

26 Jul 2022 · Zhuo Chen, Yufeng Huang, Jiaoyan Chen, Yuxia Geng, Yin Fang, Jeff Pan, Ningyu Zhang, Wen Zhang

Visual question answering (VQA) often requires an understanding of visual concepts and language semantics, which relies on external knowledge. Most existing methods exploit pre-trained language models and/or unstructured text, but the knowledge in these resources is often incomplete and noisy. Other methods prefer knowledge graphs (KGs), which often contain intensive structured knowledge, but the research there is still quite preliminary. In this paper, we propose LaKo, a knowledge-driven VQA method via Late Knowledge-to-text Injection. To effectively incorporate an external KG, we transfer triples into textual format and propose a late injection mechanism for knowledge fusion. Finally, we address VQA as a text generation task with an effective encoder-decoder paradigm, which achieves state-of-the-art results on the OK-VQA dataset.
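To make the two core ideas in the abstract concrete, below is a minimal sketch of (a) verbalizing KG triples into text and (b) a Fusion-in-Decoder-style "late injection," where each knowledge passage is encoded with the question separately and fused only at the decoder. This is an illustrative reading of the abstract using an off-the-shelf T5 backbone; the `verbalize` helper, the prompt format, and the fusion details are assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: triple verbalization + FiD-style late injection
# with a T5 encoder-decoder (assumed setup, not the paper's exact code).
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def verbalize(triple):
    """Turn a KG triple (head, relation, tail) into a plain-text statement."""
    h, r, t = triple
    return f"{h} {r.replace('_', ' ')} {t}."

triples = [("banana", "has_color", "yellow"), ("banana", "is_a", "fruit")]
question = "question: what color is the fruit on the table?"
context = "context: a bunch of bananas on a wooden table."  # e.g. image caption/tags

# Late injection: pair the question with each verbalized knowledge passage
# and encode each pair independently.
inputs = [f"{question} {context} knowledge: {verbalize(t)}" for t in triples]
batch = tokenizer(inputs, padding=True, return_tensors="pt")
with torch.no_grad():
    enc = model.encoder(input_ids=batch.input_ids,
                        attention_mask=batch.attention_mask)

# Fuse late: concatenate the per-passage encoder states into one long
# sequence, so the decoder attends over all knowledge at once.
hidden = enc.last_hidden_state.reshape(1, -1, enc.last_hidden_state.size(-1))
mask = batch.attention_mask.reshape(1, -1)

out = model.generate(
    encoder_outputs=BaseModelOutput(last_hidden_state=hidden),
    attention_mask=mask,
    max_length=8,
)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Treating VQA as text generation this way means the answer vocabulary is open-ended rather than restricted to a fixed classifier head, which is what lets textualized KG knowledge influence the output directly.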

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Visual Question Answering (VQA) | OK-VQA | LaKo | Accuracy | 47.01 | #21 |
| Visual Question Answering (VQA) | OK-VQA | T5 (Tan and Bansal, 2019) + Prefixes | Accuracy | 42.03 | #25 |
| Visual Question Answering (VQA) | VQA v2 test-dev | LaKo | Accuracy | 68.07 | #36 |
