TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Natural Language Understanding	PDP60	Word-level CNN+LSTM (full scoring)	Accuracy	60.0	# 9
Natural Language Understanding	PDP60	Word-level CNN+LSTM (partial scoring)	Accuracy	53.3	# 12
Coreference Resolution	Winograd Schema Challenge	Ensemble of 14 LMs	Accuracy	63.7	# 45
Coreference Resolution	Winograd Schema Challenge	Word-level CNN+LSTM (partial scoring)	Accuracy	62.6	# 49
Coreference Resolution	Winograd Schema Challenge	Char-level CNN+LSTM (partial scoring)	Accuracy	57.9	# 64

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-simple-method-for-commonsense-reasoning/natural-language-understanding-on-pdp60)](https://paperswithcode.com/sota/natural-language-understanding-on-pdp60?p=a-simple-method-for-commonsense-reasoning)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-simple-method-for-commonsense-reasoning/coreference-resolution-on-winograd-schema)](https://paperswithcode.com/sota/coreference-resolution-on-winograd-schema?p=a-simple-method-for-commonsense-reasoning)`

A Simple Method for Commonsense Reasoning

7 Jun 2018 · Trieu H. Trinh, Quoc V. Le

Commonsense reasoning is a long-standing challenge for deep learning. For example, it is difficult to use neural networks to tackle the Winograd Schema dataset (Levesque et al., 2011). In this paper, we present a simple method for commonsense reasoning with neural networks, using unsupervised learning. Key to our method is the use of language models, trained on a massive amount of unlabeled data, to score multiple-choice questions posed by commonsense reasoning tests. On both the Pronoun Disambiguation and Winograd Schema challenges, our models outperform previous state-of-the-art methods by a large margin, without using expensive annotated knowledge bases or hand-engineered features. We train an array of large RNN language models that operate at the word or character level on LM-1-Billion, CommonCrawl, SQuAD, Gutenberg Books, and a customized corpus for this task, and show that diversity of training data plays an important role in test performance. Further analysis also shows that our system successfully discovers important features of the context that decide the correct answer, indicating a good grasp of commonsense knowledge.
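The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy add-one-smoothed bigram model (`BigramLM`) is a stand-in for the paper's large RNN language models, and the `[MASK]` placeholder, the helper names, and the three-sentence corpus are all hypothetical. The sketch also assumes that "full scoring" in the results table means scoring the whole substituted sentence, and that "partial scoring" means scoring only the words after the substituted candidate, conditioned on everything before them.

```python
import math
from collections import Counter


class BigramLM:
    """Toy add-one-smoothed bigram LM (a stand-in, not the paper's RNN LMs)."""

    def __init__(self, corpus):
        self.context_counts = Counter()  # how often a token appears as a context
        self.bigram_counts = Counter()   # how often (previous, current) occurs
        self.vocab = set()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.lower().split()
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def log_prob(self, prev, cur):
        # Add-one smoothing; the extra +1 reserves mass for unseen tokens.
        size = len(self.vocab) + 1
        numerator = self.bigram_counts[(prev, cur)] + 1
        return math.log(numerator / (self.context_counts[prev] + size))


def token_log_probs(lm, tokens):
    """Log-probability of each token given its predecessor."""
    padded = ["<s>"] + tokens
    return [lm.log_prob(prev, cur) for prev, cur in zip(padded, padded[1:])]


def score(lm, template, candidate, mode="full"):
    """Substitute the candidate for [MASK] and score the result with the LM."""
    before, after = template.split("[MASK]")
    head = before.lower().split() + candidate.lower().split()
    tail = after.lower().split()
    log_probs = token_log_probs(lm, head + tail)
    if mode == "full":
        return sum(log_probs)             # whole substituted sentence
    if mode == "partial":
        return sum(log_probs[len(head):])  # only the words after the candidate
    raise ValueError(f"unknown scoring mode: {mode}")


def resolve(lm, template, candidates, mode="full"):
    """Pick the candidate whose substitution the LM finds most probable."""
    return max(candidates, key=lambda c: score(lm, template, c, mode))


# Hypothetical toy data, for illustration only.
lm = BigramLM([
    "the trophy is too big",
    "the trophy is big",
    "the suitcase holds clothes",
])
template = "the trophy does not fit in the suitcase because [MASK] is too big"
candidates = ["the trophy", "the suitcase"]

for mode in ("full", "partial"):
    print(mode, "->", resolve(lm, template, candidates, mode))
```

A bigram model only sees one word of context, so this toy cannot capture the long-range cues that the paper's models rely on; it only shows the mechanics of substituting each candidate and comparing language-model scores.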


Code


tensorflow/models

65,338

gabimelo/portuguese_wsc

Tasks


Common Sense Reasoning

Coreference Resolution

Multiple-choice

Natural Language Understanding

Datasets

Introduced in the Paper:

CC-Stories

Used in the Paper:

SQuAD

WSC

Results from the Paper


Ranked #9 on Natural Language Understanding on PDP60


