SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

EMNLP 2018 · Rowan Zellers, Yonatan Bisk, Roy Schwartz, Yejin Choi

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). In this paper, we introduce the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning. We present SWAG, a new dataset with 113k multiple choice questions about a rich spectrum of grounded situations. To address the recurring challenges of the annotation artifacts and human biases found in many existing datasets, we propose Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data. To account for the aggressive adversarial filtering, we use state-of-the-art language models to massively oversample a diverse set of potential counterfactuals. Empirical results demonstrate that while humans can solve the resulting inference problems with high accuracy (88%), various competitive models struggle on our task. We provide comprehensive analysis that indicates significant opportunities for future research.
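The filtering loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each ending is reduced to a single hypothetical "style" number, and it swaps in a 1-D nearest-centroid model for the ensemble of stylistic classifiers. The names `train_discriminator`, `adversarial_filter`, and the `gold` / `distractors` / `pool` keys are made up for illustration.

```python
import random
from statistics import mean


def train_discriminator(train):
    """Toy stand-in for the ensemble of stylistic classifiers (assumption).

    Fits a 1-D nearest-centroid model and returns fake_score(x):
    a higher score means x looks more like a machine-written ending.
    """
    real_centroid = mean([ex["gold"] for ex in train])
    fake_centroid = mean([d for ex in train for d in ex["distractors"]])

    def fake_score(x):
        # Positive when x lies closer to the fake centroid than to the real one.
        return abs(x - real_centroid) - abs(x - fake_centroid)

    return fake_score


def adversarial_filter(examples, n_rounds=5, seed=0):
    """Iteratively swap easy distractors for harder ones from each example's pool.

    The example dicts are modified in place and the same list is returned.
    """
    rng = random.Random(seed)
    data = list(examples)  # shuffle a shallow copy so the caller's order is kept
    for _ in range(n_rounds):
        rng.shuffle(data)
        half = len(data) // 2
        train, held_out = data[:half], data[half:]
        fake_score = train_discriminator(train)
        for ex in held_out:
            for i, current in enumerate(ex["distractors"]):
                if not ex["pool"]:
                    break
                # Pool candidate the discriminator finds hardest to flag as fake.
                hardest = min(ex["pool"], key=fake_score)
                if fake_score(hardest) < fake_score(current):
                    ex["pool"].remove(hardest)
                    ex["pool"].append(current)
                    ex["distractors"][i] = hardest
    return examples
```

Each round retrains the discriminator on a fresh random split and only edits the held-out half, so a distractor is replaced only when a model that has not seen that example can still spot it. The large per-example `pool` plays the role of the oversampled candidate endings mentioned in the abstract.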

Code

millenialSpirou/ift6010

Tasks

Common Sense Reasoning

Multiple-choice

Natural Language Inference

Question Answering

Datasets

Introduced in the Paper:

SWAG

Used in the Paper:

SNLI

ConceptNet

ActivityNet

COPA

ActivityNet Captions

Visual Madlibs

Results from the Paper

Ranked #4 on Common Sense Reasoning on SWAG

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Common Sense Reasoning	SWAG	ESIM + ELMo	Dev	59.1	# 2
Common Sense Reasoning	SWAG	ESIM + ELMo	Test	59.2	# 4
Common Sense Reasoning	SWAG	ESIM + GloVe	Dev	51.9	# 3
Common Sense Reasoning	SWAG	ESIM + GloVe	Test	52.7	# 5
