ScanQA: 3D Question Answering for Spatial Scene Understanding

We propose a new 3D spatial understanding task, 3D Question Answering (3D-QA). In 3D-QA, a model receives visual information from an entire 3D scene, captured as a rich RGB-D indoor scan, and answers textual questions about that scene. Unlike 2D visual question answering (VQA), conventional 2D-QA models struggle with the spatial understanding of object alignment and directions required in 3D-QA and fail to identify the objects referred to in the questions. As a baseline for 3D-QA, we propose the ScanQA model, which learns a fused descriptor from 3D object proposals and encoded sentence embeddings. This learned descriptor correlates language expressions with the underlying geometric features of the 3D scan, facilitates the regression of 3D bounding boxes for the objects described in the question, and produces the correct answers. We also collected human-edited question-answer pairs with free-form answers grounded to 3D objects in each scene. The resulting ScanQA dataset contains over 40K question-answer pairs covering 800 indoor scenes drawn from the ScanNet dataset. To the best of our knowledge, the proposed 3D-QA task is the first large-scale effort to perform object-grounded question answering in 3D environments.
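
The fusion step described above can be made concrete with a rough sketch. The following is a minimal, illustrative PyTorch-style example, not the authors' released implementation: the module name SimpleFusionQA, the GRU question encoder, and the specific MLP heads are assumptions. It shows the general idea of fusing per-proposal 3D features with a question embedding, then predicting an answer distribution and a per-proposal 3D box for grounding.

```python
# Illustrative sketch only; not the authors' released ScanQA code.
# Assumed inputs: object_feats come from an external 3D proposal module
# (e.g., a VoteNet-style detector); questions are pre-tokenized word ids.
import torch
import torch.nn as nn

class SimpleFusionQA(nn.Module):
    def __init__(self, vocab_size, num_answers, obj_dim=256, hid=256):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, hid)
        self.q_encoder = nn.GRU(hid, hid, batch_first=True)   # sentence embedding
        self.obj_proj = nn.Linear(obj_dim, hid)                # project proposal features
        self.fuse = nn.Linear(2 * hid, hid)                    # fused descriptor
        self.answer_head = nn.Linear(hid, num_answers)         # answer classification
        self.box_head = nn.Linear(hid, 6)                      # center (3) + size (3) regression
        self.ref_head = nn.Linear(hid, 1)                      # which proposal the question refers to

    def forward(self, object_feats, question_tokens):
        # object_feats: (B, K, obj_dim) features of K 3D object proposals
        # question_tokens: (B, T) word ids
        _, q = self.q_encoder(self.word_emb(question_tokens))  # q: (1, B, hid)
        q = q.squeeze(0)                                        # (B, hid)
        obj = self.obj_proj(object_feats)                       # (B, K, hid)
        q_exp = q.unsqueeze(1).expand_as(obj)                   # broadcast question to proposals
        fused = torch.relu(self.fuse(torch.cat([obj, q_exp], dim=-1)))  # (B, K, hid)

        ref_logits = self.ref_head(fused).squeeze(-1)           # (B, K) grounding scores
        attn = torch.softmax(ref_logits, dim=-1).unsqueeze(-1)  # soft selection of proposals
        pooled = (attn * fused).sum(dim=1)                      # (B, hid) question-conditioned scene descriptor

        answer_logits = self.answer_head(pooled)                # (B, num_answers)
        boxes = self.box_head(fused)                            # (B, K, 6) per-proposal box regression
        return answer_logits, ref_logits, boxes
```

In this sketch the grounding scores double as attention weights, so the answer head sees a descriptor biased toward the proposals the question refers to, which mirrors the paper's idea of tying answer prediction to 3D object grounding.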

CVPR 2022
Benchmarks

Task: 3D Question Answering (3D-QA)
Dataset: ScanQA Test w/ objects (global rank on the benchmark in parentheses)

Model            Exact Match   BLEU-1       BLEU-4       ROUGE        METEOR       CIDEr
ScanQA           23.45 (#3)    31.56 (#6)   12.04 (#3)   34.34 (#6)   13.55 (#5)   67.29 (#4)
VoteNet+MCAN     19.71 (#6)    29.46 (#7)    6.08 (#8)   30.97 (#7)   12.07 (#7)   58.23 (#7)
ScanRefer+MCAN   20.56 (#5)    27.85 (#8)    7.46 (#7)   30.68 (#8)   11.97 (#8)   57.56 (#8)
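
The Exact Match column treats answering as string matching against the human-provided reference answers, while BLEU, ROUGE, METEOR, and CIDEr credit partial n-gram overlap for free-form answers. Below is a minimal sketch of how an exact-match score over a set of reference answers is typically computed; the normalization details are assumptions for illustration, not the benchmark's official scoring script.

```python
def exact_match(prediction: str, references: list[str]) -> float:
    """Return 1.0 if the predicted answer matches any reference answer.

    Lower-casing and whitespace normalization are assumptions for
    illustration; the official ScanQA evaluation may differ.
    """
    norm = lambda s: " ".join(s.lower().split())
    return float(any(norm(prediction) == norm(ref) for ref in references))

# Example: a question may have several acceptable free-form answers.
print(exact_match("Brown chair", ["brown chair", "wooden chair"]))  # 1.0
```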
