Iterative Hierarchical Attention for Answering Complex Questions over Long Documents

1 Jun 2021 · Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

We propose a new model, DocHopper, that iteratively attends to different parts of long, hierarchically structured documents to answer complex questions. Similar to multi-hop question-answering (QA) systems, at each step DocHopper uses a query $q$ to attend to information from a document and combines this "retrieved" information with $q$ to produce the next query. However, in contrast to most previous multi-hop QA systems, DocHopper is able to "retrieve" either short passages or long sections of the document, thus emulating a multi-step process of "navigating" through a long document to answer a question. To enable this novel behavior, DocHopper does not combine document information with $q$ by concatenating retrieved text to the text of $q$; instead, it combines a compact neural representation of $q$ with a compact neural representation of a hierarchical part of the document, which can potentially be quite large. We experiment with DocHopper on four different QA tasks that require reading long and complex documents to answer multi-hop questions, and show that DocHopper achieves state-of-the-art results on three of the datasets. Additionally, DocHopper is efficient at inference time, being 3 to 10 times faster than the baselines.
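The iterative attend-and-update loop described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`build_hierarchy`, `iterative_attention`), the mean pooling used to form section vectors, the soft attention readout, and the linear `mix` update are all assumptions chosen for illustration. The abstract states only that a compact representation of $q$ is combined with a compact representation of a part of the document, which may be a short passage or a long section.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()

def build_hierarchy(sentence_vecs, section_ids):
    """Stack sentence vectors with one mean-pooled vector per section.

    Mean pooling is an assumption; any encoder that yields one vector
    per section would fit the same interface.
    """
    sentence_vecs = np.asarray(sentence_vecs, dtype=float)
    section_ids = np.asarray(section_ids)
    section_vecs = np.stack([
        sentence_vecs[section_ids == s].mean(axis=0)
        for s in np.unique(section_ids)
    ])
    # Candidates at both levels share one embedding space, so a single
    # attention step can select either a sentence or a whole section.
    return np.concatenate([sentence_vecs, section_vecs], axis=0)

def iterative_attention(q, candidates, num_hops=2, mix=0.5):
    """Attend over candidates, then fold the retrieved vector into q."""
    q = np.asarray(q, dtype=float)
    trace = []
    for _ in range(num_hops):
        weights = softmax(candidates @ q)    # attention over all candidates
        retrieved = weights @ candidates     # soft "retrieved" representation
        trace.append(int(np.argmax(weights)))
        # Hypothetical update rule: combine in vector space rather than
        # concatenating retrieved text to the question text.
        q = (1 - mix) * q + mix * retrieved
    return q, trace

# Three sentences in two sections, embedded in a toy 3-D space.
sentence_vecs = [[5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 5.0]]
section_ids = [0, 0, 1]
candidates = build_hierarchy(sentence_vecs, section_ids)
final_q, trace = iterative_attention([1.0, 0.0, 0.0], candidates, num_hops=2)
```

Because the query stays a fixed-size vector at every hop, the cost of a hop does not grow with the length of the retrieved span. That is the property the abstract contrasts with text concatenation.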

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Question Answering | ConditionalQA | DocHopper | Conditional (answers) | 42.0 / 46.4 | # 2 |
| | | | Conditional (w/ conditions) | 3.1 / 3.8 | # 2 |
| | | | Overall (answers) | 40.6 / 45.2 | # 2 |
| | | | Overall (w/ conditions) | 31.9 / 36.0 | # 2 |
| Question Answering | HybridQA | DocHopper | ANS-EM | 46.3 | # 3 |
