Situation Recognition: Visual Semantic Role Labeling for Image Understanding

CVPR 2016 · Mark Yatskar, Luke Zettlemoyer, Ali Farhadi

This paper introduces situation recognition, the problem of producing a concise summary of the situation an image depicts, including: (1) the main activity (e.g., clipping), (2) the participating actors, objects, substances, and locations (e.g., man, shears, sheep, wool, and field), and, most importantly, (3) the roles these participants play in the activity (e.g., the man is clipping, the shears are his tool, the wool is being clipped from the sheep, and the clipping is in a field). We use FrameNet, a verb and role lexicon developed by linguists, to define a large space of possible situations and collect a large-scale dataset containing over 500 activities, 1,700 roles, 11,000 objects, 125,000 images, and 200,000 unique situations. We also introduce structured prediction baselines and show that, in activity-centric images, situation-driven prediction of objects and activities outperforms independent object and activity recognition.
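The prediction target is structured: a situation pairs a verb with a frame that maps each of the verb's FrameNet roles to a noun value. Below is a minimal Python sketch of that structure together with a CRF-style scoring and inference loop; the potential functions, argument names, and exhaustive candidate enumeration are illustrative assumptions, not the paper's actual model or code.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Situation:
    verb: str    # main activity, e.g. "clipping"
    frame: dict  # role -> noun value, e.g. {"agent": "man", "tool": "shears"}

def score(image_feats, sit, verb_potential, role_potential):
    """CRF-style score: one potential for the verb plus one per (role, value)
    pair, all conditioned on the image (hypothetical potential functions)."""
    total = verb_potential(image_feats, sit.verb)
    for role, value in sit.frame.items():
        total += role_potential(image_feats, sit.verb, role, value)
    return total

def predict(image_feats, candidates, verb_potential, role_potential):
    """Inference sketch: exhaustively score each candidate situation and keep
    the best one; a real model would not enumerate the full space."""
    return max(candidates,
               key=lambda s: score(image_feats, s, verb_potential, role_potential))

# Toy usage with dummy potentials:
cands = [Situation("clipping", {"agent": "man", "tool": "shears", "place": "field"})]
print(predict(None, cands, lambda img, v: 1.0, lambda img, v, r, n: 0.5).verb)
```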


Datasets


imSitu (introduced in this paper)

Results from the Paper


Task                            Dataset  Model  Metric               Value  Global Rank
Situation Recognition           imSitu   CRF    Top-1 Verb           32.34  #12
Situation Recognition           imSitu   CRF    Top-1 Verb & Value   24.64  #12
Situation Recognition           imSitu   CRF    Top-5 Verbs          58.88  #12
Situation Recognition           imSitu   CRF    Top-5 Verbs & Value  42.76  #12
Grounded Situation Recognition  SWiG     CRF    Top-1 Verb           32.34  #12
Grounded Situation Recognition  SWiG     CRF    Top-1 Verb & Value   24.64  #12
Grounded Situation Recognition  SWiG     CRF    Top-5 Verbs          58.88  #12
Grounded Situation Recognition  SWiG     CRF    Top-5 Verbs & Value  42.76  #12
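For reference, here is a hedged sketch of how the accuracy columns above can be computed, reusing the Situation class from the earlier snippet. The exact definition of "Value" (per-role credit versus requiring every role to be correct) and the multi-annotator gold format are assumptions, not details taken from this page.

```python
def top_k_verb(ranked_preds, gold_verb, k=1):
    """Top-k verb accuracy: a hit if the gold verb appears among the k
    highest-scoring predicted situations."""
    return gold_verb in [p.verb for p in ranked_preds[:k]]

def top_k_verb_and_value(ranked_preds, gold_verb, gold_frames, k=1):
    """Per-role value credit within the top k (an assumed reading of
    "Verb & Value"): a role counts when the prediction's verb matches the
    gold verb and its noun matches at least one annotator's value.
    gold_frames: one role -> noun dict per annotator (assumed format)."""
    best = 0.0
    for p in ranked_preds[:k]:
        if p.verb != gold_verb or not p.frame:
            continue
        hits = sum(any(f.get(role) == value for f in gold_frames)
                   for role, value in p.frame.items())
        best = max(best, hits / len(p.frame))
    return best
```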

Methods


CRF (conditional random field), the structured prediction baseline reported in the results above.