Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Visual Question Answering (VQA)	VQA v1 test-dev	SAAA (ResNet)	Accuracy	64.5	#1
Visual Question Answering (VQA)	VQA v1 test-std	SAAA (ResNet)	Accuracy	64.6	#1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/show-ask-attend-and-answer-a-strong-baseline/visual-question-answering-on-vqa-v1-test-dev)](https://paperswithcode.com/sota/visual-question-answering-on-vqa-v1-test-dev?p=show-ask-attend-and-answer-a-strong-baseline)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/show-ask-attend-and-answer-a-strong-baseline/visual-question-answering-on-vqa-v1-test-std)](https://paperswithcode.com/sota/visual-question-answering-on-vqa-v1-test-std?p=show-ask-attend-and-answer-a-strong-baseline)`
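The two badge snippets above share one URL pattern: a paper slug plus a benchmark slug. As a minimal sketch, assuming that pattern holds for other benchmarks as well (it is inferred only from the two badges listed here, and the helper name `pwc_badge_markdown` is hypothetical), the Markdown can be assembled like this:

```python
# Hypothetical helper: the URL layout is inferred from the two badges
# listed above, not taken from any documented API.
BADGE_ENDPOINT = "https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge"
SOTA_BASE = "https://paperswithcode.com/sota"


def pwc_badge_markdown(paper_slug: str, benchmark_slug: str) -> str:
    """Build the badge Markdown for one paper/benchmark pair."""
    image_url = f"{BADGE_ENDPOINT}/{paper_slug}/{benchmark_slug}"
    link_url = f"{SOTA_BASE}/{benchmark_slug}?p={paper_slug}"
    # A Markdown image wrapped in a link: [![alt](image)](target)
    return f"[![PWC]({image_url})]({link_url})"


PAPER = "show-ask-attend-and-answer-a-strong-baseline"
BENCHMARKS = (
    "visual-question-answering-on-vqa-v1-test-dev",
    "visual-question-answering-on-vqa-v1-test-std",
)

if __name__ == "__main__":
    for benchmark in BENCHMARKS:
        print(pwc_badge_markdown(PAPER, benchmark))
```

Pasting either printed line into a README should render the badge as a link to the corresponding leaderboard.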

Show, Ask, Attend, and Answer: A Strong Baseline For Visual Question Answering

11 Apr 2017 · Vahid Kazemi, Ali Elqursh

This paper presents a new baseline for the visual question answering task. Given an image and a question in natural language, our model produces accurate answers according to the content of the image. Our model, while architecturally simple and relatively small in terms of trainable parameters, sets a new state of the art on both the unbalanced and balanced VQA benchmarks. On the VQA 1.0 open-ended challenge, our model achieves 64.6% accuracy on the test-standard set without using additional data, an improvement of 0.4% over the state of the art, and on the newly released VQA 2.0, our model scores 59.7% on the validation set, outperforming the best previously reported results by 0.5%. The results presented in this paper are especially interesting because very similar models have been tried before, but significantly lower performance was reported. In light of the new results, we hope to see more meaningful research on visual question answering in the future.
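The abstract states each result together with its margin over earlier work, so the implied earlier figures can be recovered by subtraction. The sketch below does that arithmetic under one assumption that the abstract does not spell out: that "0.4%" and "0.5%" are absolute differences in percentage points. The function name is illustrative only.

```python
# Figures quoted in the abstract: (reported accuracy, stated improvement), in percent.
reported = {
    "VQA 1.0 test-standard": (64.6, 0.4),
    "VQA 2.0 validation": (59.7, 0.5),
}


def implied_previous_best(accuracy: float, improvement: float) -> float:
    """Return the earlier best accuracy implied by the stated margin.

    Assumption: the improvement is an absolute difference in percentage
    points, not a relative gain.
    """
    # Round to one decimal place to match the precision used in the abstract
    # and to hide floating-point noise from the subtraction.
    return round(accuracy - improvement, 1)


if __name__ == "__main__":
    for benchmark, (accuracy, improvement) in reported.items():
        previous = implied_previous_best(accuracy, improvement)
        print(f"{benchmark}: reported {accuracy}%, implied previous best {previous}%")
```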

Code

Cyanogenoid/pytorch-vqa (237 stars)

guoyang9/vqa-prior

pramodkaushik/visual_qa_analysis

zixuwang1996/VQA-reading-list

Gunnika/Visual-Question-Answering

See all 12 implementations

Tasks

Visual Question Answering

Visual Question Answering (VQA)

Datasets

Visual Question Answering

Visual Question Answering v2.0

Results from the Paper

Ranked #1 on Visual Question Answering (VQA) on VQA v1 test-dev

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Visual Question Answering (VQA)	VQA v1 test-dev	SAAA (ResNet)	Accuracy	64.5	#1
Visual Question Answering (VQA)	VQA v1 test-std	SAAA (ResNet)	Accuracy	64.6	#1

Methods

1x1 Convolution • Average Pooling • Batch Normalization • Bottleneck Residual Block • Convolution • Global Average Pooling • Kaiming Initialization • Max Pooling • ReLU • Residual Block • Residual Connection • ResNet
