Commonly Uncommon: Semantic Sparsity in Situation Recognition

Semantic sparsity is a common challenge in structured visual classification problems; when the output space is complex, the vast majority of the possible predictions are rarely, if ever, seen in the training set. This paper studies semantic sparsity in situation recognition, the task of producing structured summaries of what is happening in images, including activities, objects and the roles objects play within the activity. For this problem, we find empirically that most object-role combinations are rare, and current state-of-the-art models significantly underperform in this sparse data regime. We avoid many such errors by (1) introducing a novel tensor composition function that learns to share examples across role-noun combinations and (2) semantically augmenting our training data with automatically gathered examples of rarely observed outputs using web data. When integrated within a complete CRF-based structured prediction model, the tensor-based approach outperforms existing state of the art by a relative improvement of 2.11% and 4.40% on top-5 verb and noun-role accuracy, respectively. Adding 5 million images with our semantic augmentation techniques gives further relative improvements of 6.23% and 9.57% on top-5 verb and noun-role accuracy.
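The tensor composition idea can be illustrated with a minimal sketch. All dimensions, parameter names, and the elementwise composition below are illustrative assumptions, not the paper's exact formulation: the point is only that a role-noun score is built from shared role and noun embeddings, so rarely observed role-noun pairs borrow statistical strength from embeddings trained on common pairs.

```python
import numpy as np

# Hypothetical sizes (not from the paper): image feature dim D, embedding dim E.
D, E = 512, 128
NUM_ROLES, NUM_NOUNS = 50, 90

rng = np.random.default_rng(0)

# In the real model these are learned; random here purely for illustration.
W_img  = rng.standard_normal((E, D)) * 0.01        # projects image features
E_role = rng.standard_normal((NUM_ROLES, E)) * 0.01  # shared role embeddings
E_noun = rng.standard_normal((NUM_NOUNS, E)) * 0.01  # shared noun embeddings

def role_noun_potential(img_feat, role_id, noun_id):
    """Score one (role, noun) assignment for an image.

    The role and noun are composed (here: elementwise product) rather than
    looked up as a single role-noun unit, so a rare combination like
    (role=agent, noun=otter) reuses parameters seen in common combinations.
    """
    h = W_img @ img_feat                              # image projection, shape (E,)
    composed = E_role[role_id] * E_noun[noun_id]      # composed pair embedding
    return float(h @ composed)                        # scalar potential
```

In a CRF over the full situation, potentials like this one would be combined with a verb potential and normalized over the structured output space; the sketch stops at the pairwise score.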

PDF · Abstract (CVPR 2017)

Results from the Paper


Task                            Dataset  Model      Metric Name          Metric Value  Global Rank
Situation Recognition           imSitu   CRF + Aug  Top-1 Verb           34.12         # 11
                                                    Top-1 Verb & Value   26.45         # 11
                                                    Top-5 Verbs          62.59         # 10
                                                    Top-5 Verbs & Value  46.88         # 9
Grounded Situation Recognition  SWiG     CRF + Aug  Top-1 Verb           34.12         # 11
                                                    Top-1 Verb & Value   26.45         # 11
                                                    Top-5 Verbs          62.59         # 10
                                                    Top-5 Verbs & Value  46.88         # 9
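For reference, the Top-1/Top-5 verb metrics in the table can be computed as in the sketch below. The `top_k_verb_accuracy` helper and the toy score matrix are illustrative assumptions; the "Verb & Value" variants additionally require the predicted role-noun values to match the annotation, which this sketch does not cover.

```python
import numpy as np

def top_k_verb_accuracy(scores, gold_verbs, k=5):
    """Fraction of images whose gold verb is among the k highest-scoring verbs.

    scores:     (num_images, num_verbs) array of model scores
    gold_verbs: (num_images,) array of gold verb indices
    """
    topk = np.argsort(-scores, axis=1)[:, :k]          # indices of k best verbs
    hits = [g in row for g, row in zip(gold_verbs, topk)]
    return sum(hits) / len(hits)

# Toy example: 2 images, 3 verbs.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2]])
gold = np.array([1, 2])
print(top_k_verb_accuracy(scores, gold, k=1))  # 0.5: only image 0 is correct
print(top_k_verb_accuracy(scores, gold, k=3))  # 1.0: gold verb always in top 3
```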

Methods


No methods listed for this paper.