Phase Conductor on Multi-layered Attentions for Machine Comprehension

ICLR 2018  ·  Rui Liu, Wei Wei, Weiguang Mao, Maria Chikina

Attention models have been studied intensively to improve NLP tasks such as machine comprehension, via both question-aware passage attention models and self-matching attention models. Our research proposes a phase conductor (PhaseCond) for attention models in two meaningful ways. First, PhaseCond, an architecture of multi-layered attention models, consists of multiple phases, each implementing a stack of attention layers that produce passage representations and a stack of inner- or outer-fusion layers that regulate the information flow. Second, we extend and improve the dot-product attention function for PhaseCond by simultaneously encoding multiple question and passage embedding layers from different perspectives. We demonstrate the effectiveness of the proposed model, PhaseCond, on the SQuAD dataset, showing that it significantly outperforms both state-of-the-art single-layered and multi-layered attention models. We deepen our results with new findings from detailed qualitative analysis and visualized examples that show the dynamic changes through the multi-layered attention model.
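To make the abstract's central building block concrete, here is a minimal sketch of question-aware passage attention based on the dot-product attention function the paper builds on. This is an illustrative NumPy version, not the paper's exact formulation: the function name, shapes, and the single-view setup (the paper encodes multiple question and passage embedding layers) are assumptions for the example.

```python
import numpy as np

def dot_product_attention(passage, question):
    """Illustrative question-aware passage attention (a sketch, not PhaseCond itself).

    passage:  (n, d) array of passage token representations
    question: (m, d) array of question token representations
    Returns an (n, d) question-aware passage representation.
    """
    # Dot-product similarity of every passage token with every question token.
    scores = passage @ question.T                 # shape (n, m)
    # Softmax over question tokens, shifted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each passage token becomes a weighted sum of question vectors.
    return weights @ question                     # shape (n, d)

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 8))   # 5 passage tokens, dimension 8
Q = rng.standard_normal((3, 8))   # 3 question tokens
out = dot_product_attention(P, Q)
print(out.shape)  # (5, 8)
```

In a multi-layered architecture like the one described above, the output of one such attention layer would feed the next stack of attention and fusion layers rather than being used directly.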



Results from the Paper


Task                Dataset       Model                          Metric  Value   Global Rank
Question Answering  SQuAD1.1      Conductor-net (ensemble)       EM      76.996  #100
Question Answering  SQuAD1.1      Conductor-net (ensemble)       F1      84.630  #99
Question Answering  SQuAD1.1      Conductor-net (single model)   EM      74.405  #123
Question Answering  SQuAD1.1      Conductor-net (single model)   F1      82.742  #121
Question Answering  SQuAD1.1      Conductor-net (single)         EM      73.240  #133
Question Answering  SQuAD1.1      Conductor-net (single)         F1      81.933  #128
Question Answering  SQuAD1.1 dev  PhaseCond (single)             EM      72.1    #32
Question Answering  SQuAD1.1 dev  PhaseCond (single)             F1      81.4    #35
