Bidirectional Attention Flow for Machine Comprehension

5 Nov 2016  ·  Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi ·

Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. Recently, attention mechanisms have been successfully extended to MC. Typically, these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and/or form a uni-directional attention. In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. Our experimental evaluations show that our model achieves state-of-the-art results on the Stanford Question Answering Dataset (SQuAD) and the CNN/DailyMail cloze test.
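The core of the attention layer described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes context vectors `H` of shape `(T, d)`, query vectors `U` of shape `(J, d)`, and a trainable similarity weight `w` of shape `(3d,)`, computing context-to-query and query-to-context attention from a shared similarity matrix and concatenating them into a query-aware representation `G` without summarizing the context into a single fixed-size vector.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U, w):
    """Bi-directional attention flow layer (illustrative sketch).

    H: (T, d) context word vectors
    U: (J, d) query word vectors
    w: (3d,) trainable weight; similarity S[t, j] = w . [h; u; h*u]
    Returns G: (T, 4d) query-aware context representation.
    """
    T, d = H.shape
    J, _ = U.shape
    # Shared similarity matrix between every context/query word pair
    S = np.empty((T, J))
    for t in range(T):
        for j in range(J):
            S[t, j] = w @ np.concatenate([H[t], U[j], H[t] * U[j]])
    # Context-to-query: each context word attends over all query words
    a = softmax(S, axis=1)              # (T, J)
    U_tilde = a @ U                     # (T, d) attended query vectors
    # Query-to-context: weight context words by their max similarity
    b = softmax(S.max(axis=1))          # (T,)
    h_tilde = b @ H                     # (d,)
    H_tilde = np.tile(h_tilde, (T, 1))  # (T, d) broadcast to every position
    # Merge both attention directions with the original context
    G = np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)
    return G
```

Note that `G` keeps one row per context word: both attention flows are computed at every time step, so no information is lost to an early fixed-size summary.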

| Task                           | Dataset          | Model               | Metric                      | Value  | Global Rank |
|--------------------------------|------------------|---------------------|-----------------------------|--------|-------------|
| Question Answering             | CNN / Daily Mail | BiDAF               | CNN                         | 76.9   | # 4         |
| Question Answering             | CNN / Daily Mail | BiDAF               | Daily Mail                  | 79.6   | # 2         |
| Question Answering             | MS MARCO         | BiDaF Baseline      | Rouge-L                     | 23.96  | # 4         |
| Question Answering             | MS MARCO         | BiDaF Baseline      | BLEU-1                      | 10.64  | # 4         |
| Question Answering             | NarrativeQA      | BiDAF               | BLEU-1                      | 33.45  | # 8         |
| Question Answering             | NarrativeQA      | BiDAF               | BLEU-4                      | 15.69  | # 7         |
| Question Answering             | NarrativeQA      | BiDAF               | METEOR                      | 15.68  | # 7         |
| Question Answering             | NarrativeQA      | BiDAF               | Rouge-L                     | 36.74  | # 8         |
| Open-Domain Question Answering | Quasar           | BiDAF               | EM (Quasar-T)               | 25.9   | # 6         |
| Open-Domain Question Answering | Quasar           | BiDAF               | F1 (Quasar-T)               | 28.5   | # 5         |
| Question Answering             | SQuAD1.1         | BiDAF (ensemble)    | EM                          | 73.744 | # 130       |
| Question Answering             | SQuAD1.1         | BiDAF (ensemble)    | F1                          | 81.525 | # 135       |
| Question Answering             | SQuAD1.1         | BiDAF (ensemble)    | Hardware Burden             | 1G     | # 1         |
| Question Answering             | SQuAD1.1         | BiDAF (ensemble)    | Operations per network pass | None   | # 1         |
| Question Answering             | SQuAD1.1         | BiDAF (single model)| EM                          | 67.974 | # 166       |
| Question Answering             | SQuAD1.1         | BiDAF (single model)| F1                          | 77.323 | # 168       |
| Question Answering             | SQuAD1.1         | BiDAF (single model)| Hardware Burden             | 1G     | # 1         |
| Question Answering             | SQuAD1.1         | BiDAF (single model)| Operations per network pass | None   | # 1         |
| Question Answering             | SQuAD1.1 dev     | BIDAF (single)      | EM                          | 67.7   | # 42        |
| Question Answering             | SQuAD1.1 dev     | BIDAF (single)      | F1                          | 77.3   | # 45        |
