TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Question Answering	NarrativeQA	MHPGM + NOIC	BLEU-1	43.63	# 4
Question Answering	NarrativeQA	MHPGM + NOIC	BLEU-4	21.07	# 4
Question Answering	NarrativeQA	MHPGM + NOIC	METEOR	19.03	# 5
Question Answering	NarrativeQA	MHPGM + NOIC	Rouge-L	44.16	# 6
Question Answering	WikiHop	MHPGM + NOIC	Test	57.9	# 8

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/commonsense-for-generative-multi-hop-question/question-answering-on-narrativeqa)](https://paperswithcode.com/sota/question-answering-on-narrativeqa?p=commonsense-for-generative-multi-hop-question)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/commonsense-for-generative-multi-hop-question/question-answering-on-wikihop)](https://paperswithcode.com/sota/question-answering-on-wikihop?p=commonsense-for-generative-multi-hop-question)`

Commonsense for Generative Multi-Hop Question Answering Tasks

EMNLP 2018 · Lisa Bauer, Yicheng Wang, Mohit Bansal

Reading comprehension QA tasks have seen a recent surge in popularity, yet most works have focused on fact-finding extractive QA. We instead focus on a more challenging multi-hop generative task (NarrativeQA), which requires the model to reason, gather, and synthesize disjoint pieces of information within the context to generate an answer. This type of multi-step reasoning also often requires understanding implicit relations, which humans resolve via external, background commonsense knowledge. We first present a strong generative baseline that uses a multi-attention mechanism to perform multiple hops of reasoning and a pointer-generator decoder to synthesize the answer. This model performs substantially better than previous generative models, and is competitive with current state-of-the-art span prediction models. We next introduce a novel system for selecting grounded multi-hop relational commonsense information from ConceptNet via a pointwise mutual information and term-frequency based scoring function. Finally, we effectively use this extracted commonsense information to fill in gaps of reasoning between context hops, using a selectively-gated attention mechanism. This boosts the model's performance significantly (also verified via human evaluation), establishing a new state-of-the-art for the task. We also show promising initial results of the generalizability of our background knowledge enhancements by demonstrating some improvement on QAngaroo-WikiHop, another multi-hop reasoning dataset.
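The abstract mentions scoring candidate ConceptNet paths with pointwise mutual information and term frequency. The following is a minimal, illustrative sketch of those two quantities only; it is not the authors' scoring function. The function names (`term_frequency`, `pmi_table`, `pmi`) and the toy concept paths are hypothetical, and co-occurrence is assumed here to mean "appears in the same candidate path".

```python
import math
from collections import Counter
from itertools import combinations


def term_frequency(concept, context_tokens):
    """Fraction of context tokens that equal the concept (assumed TF definition)."""
    if not context_tokens:
        return 0.0
    return context_tokens.count(concept) / len(context_tokens)


def pmi_table(paths):
    """PMI for every pair of concepts that co-occur in at least one path.

    Probabilities are estimated as path-level relative frequencies:
    PMI(a, b) = log( p(a, b) / (p(a) * p(b)) ).
    """
    n = len(paths)
    single = Counter()
    pair = Counter()
    for path in paths:
        nodes = sorted(set(path))          # count each concept once per path
        single.update(nodes)
        pair.update(combinations(nodes, 2))  # keys are (a, b) with a < b
    table = {}
    for (a, b), joint in pair.items():
        p_ab = joint / n
        p_a = single[a] / n
        p_b = single[b] / n
        table[(a, b)] = math.log(p_ab / (p_a * p_b))
    return table


def pmi(table, a, b):
    """Symmetric lookup; pairs that never co-occur get negative infinity."""
    key = (a, b) if a < b else (b, a)
    return table.get(key, float("-inf"))


# Hypothetical toy paths; real candidates would come from ConceptNet.
toy_paths = [
    ("lady", "church", "house"),
    ("lady", "mother", "daughter"),
    ("church", "house", "child"),
]
toy_table = pmi_table(toy_paths)
```

In this sketch a higher PMI means two concepts appear together in candidate paths more often than their individual frequencies would predict, while `term_frequency` measures how strongly a concept is grounded in the context.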

Code

yicheng-w/CommonSenseMultiHopQA official

122

a414351664/NarrativeQA

Tasks

Implicit Relations

Multi-hop Question Answering

Question Answering

Reading Comprehension

Datasets

SQuAD

bAbI

NarrativeQA

WikiHop

Results from the Paper

Ranked #6 on Question Answering on NarrativeQA

Methods

No methods listed for this paper.
