Alignment over Heterogeneous Embeddings for Question Answering

NAACL 2019 · Vikas Yadav, Steven Bethard, Mihai Surdeanu

We propose a simple, fast, and mostly unsupervised approach for non-factoid question answering (QA) called Alignment over Heterogeneous Embeddings (AHE). AHE aligns each word in the question and candidate answer with the most similar word in the retrieved supporting paragraph, and weights each alignment score by the inverse document frequency of the corresponding question/answer term. AHE's similarity function operates over embeddings that model the underlying text at different levels of abstraction: character (FLAIR), word (BERT and GloVe), and sentence (InferSent); the last of these is the only supervised component in the proposed approach. Despite its simplicity and lack of supervision, AHE obtains new state-of-the-art performance on the "Easy" partition of the AI2 Reasoning Challenge (ARC) dataset (64.6% accuracy), top-two performance on the "Challenge" partition of ARC (34.1%), and top-three performance on the WikiQA dataset (74.08% MRR), outperforming many complex, supervised approaches. Our error analysis indicates that alignments over character, word, and sentence embeddings capture substantially different semantic information. We exploit this with a simple meta-classifier that learns how much to trust the predictions over each representation, which further improves the performance of unsupervised AHE.
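The alignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity function or how per-term scores are combined, so cosine similarity and a plain IDF-weighted sum are assumptions here, and the function names, the toy two-dimensional vectors, and the IDF values are all hypothetical.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def alignment_score(
    query_terms: List[str],
    paragraph_terms: List[str],
    embeddings: Dict[str, Vector],
    idf: Dict[str, float],
) -> float:
    """Align each question/answer term with its most similar paragraph term
    and sum the best similarities, each weighted by the term's IDF.

    Assumptions (not from the paper): cosine similarity, summation as the
    aggregate, terms without an embedding are skipped, and a missing IDF
    defaults to 1.0.
    """
    score = 0.0
    for term in query_terms:
        if term not in embeddings:
            continue
        best = max(
            (
                cosine(embeddings[term], embeddings[p])
                for p in paragraph_terms
                if p in embeddings
            ),
            default=0.0,
        )
        score += idf.get(term, 1.0) * best
    return score


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_embeddings = {
        "plants": [1.0, 0.0],
        "grow": [0.9, 0.1],
        "sunlight": [0.0, 1.0],
        "light": [0.1, 0.9],
    }
    toy_idf = {"plants": 1.5, "sunlight": 2.0}
    question_and_answer = ["plants", "sunlight"]
    paragraph = ["grow", "light"]
    print(alignment_score(question_and_answer, paragraph, toy_embeddings, toy_idf))
```

Under these assumptions, the same function could be run once per embedding type (character, word, sentence), with the resulting scores passed to a separate classifier that decides how much to trust each one.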
