Reading Comprehension Models

Deep LSTM Reader

Introduced by Hermann et al. in Teaching Machines to Read and Comprehend

The Deep LSTM Reader is a neural network for reading comprehension. The document is fed one word at a time into a Deep LSTM encoder; after a delimiter, the query is fed into the same encoder. The model therefore processes each document-query pair as a single long sequence. Given the embedded document and query, the network predicts which token in the document answers the query.
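The input construction described above can be sketched in a few lines. This is only an illustration: the helper name `build_reader_input` and the example tokens are hypothetical, and treating the delimiter as an ordinary vocabulary token is an assumption rather than something the source specifies.

```python
from typing import List

# Delimiter symbol used between document and query. Handling it as a normal
# token in the vocabulary is an assumption made for this sketch.
DELIMITER = "|||"


def build_reader_input(document: List[str], query: List[str],
                       delimiter: str = DELIMITER) -> List[str]:
    """Return document tokens, then the delimiter, then query tokens."""
    return list(document) + [delimiter] + list(query)


# Hypothetical example pair; the encoder would consume these tokens in order.
tokens = build_reader_input(["the", "cat", "sat"], ["who", "sat", "?"])
```

Each token in the resulting list would then be embedded and passed to the encoder one step at a time.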

The model consists of a Deep LSTM cell with skip connections from each input $x\left(t\right)$ to every hidden layer, and from every hidden layer to the output $y\left(t\right)$:

$$x'\left(t, k\right) = x\left(t\right)||y'\left(t, k - 1\right) \text{, } y\left(t\right) = y'\left(t, 1\right)|| \dots ||y'\left(t, K\right) $$

$$ i\left(t, k\right) = \sigma\left(W_{kxi}x'\left(t, k\right) + W_{khi}h\left(t - 1, k\right) + W_{kci}c\left(t - 1, k\right) + b_{ki}\right) $$

$$ f\left(t, k\right) = \sigma\left(W_{kxf}x\left(t\right) + W_{khf}h\left(t - 1, k\right) + W_{kcf}c\left(t - 1, k\right) + b_{kf}\right) $$

$$ c\left(t, k\right) = f\left(t, k\right)c\left(t - 1, k\right) + i\left(t, k\right)\text{tanh}\left(W_{kxc}x'\left(t, k\right) + W_{khc}h\left(t - 1, k\right) + b_{kc}\right) $$

$$ o\left(t, k\right) = \sigma\left(W_{kxo}x'\left(t, k\right) + W_{kho}h\left(t - 1, k\right) + W_{kco}c\left(t, k\right) + b_{ko}\right) $$

$$ h\left(t, k\right) = o\left(t, k\right)\text{tanh}\left(c\left(t, k\right)\right) $$

$$ y'\left(t, k\right) = W_{ky}h\left(t, k\right) + b_{ky} $$

where || indicates vector concatenation, $h\left(t, k\right)$ is the hidden state and $c\left(t, k\right)$ the memory cell for layer $k$ at time $t$, and $i$, $f$, $o$ are the input, forget, and output gates respectively, each squashed by a logistic sigmoid. The Deep LSTM Reader is thus defined by $g^{\text{LSTM}}\left(d, q\right) = y\left(|d|+|q|\right)$, with input $x\left(t\right)$ the concatenation of $d$ and $q$ separated by the delimiter |||.
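To make the recurrence concrete, here is a minimal NumPy sketch of one time step and of the reader output $g^{\text{LSTM}}(d, q)$ as the final $y(t)$. It is a sketch under stated assumptions, not the authors' implementation: the gates are assumed to use the logistic sigmoid (standard for LSTMs), the peephole weights $W_{kc\cdot}$ are taken to be full matrices, $y'(t, 0)$ is taken to be empty so that the first layer sees only $x(t)$, and the forget gate reads $x(t)$ directly as the equations are written. The function names, dimensions, and random initialisation are all hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(rng, n_layers, d_in, d_hid, d_out):
    """Random weights for K layers (hypothetical initialisation scheme)."""
    def w(rows, cols):
        return rng.normal(0.0, 0.1, size=(rows, cols))

    params = []
    for k in range(n_layers):
        # x'(t, k) = x(t) || y'(t, k-1); assumed to be just x(t) for the first layer.
        d_xk = d_in if k == 0 else d_in + d_out
        params.append({
            "Wxi": w(d_hid, d_xk), "Whi": w(d_hid, d_hid), "Wci": w(d_hid, d_hid), "bi": np.zeros(d_hid),
            "Wxf": w(d_hid, d_in), "Whf": w(d_hid, d_hid), "Wcf": w(d_hid, d_hid), "bf": np.zeros(d_hid),
            "Wxc": w(d_hid, d_xk), "Whc": w(d_hid, d_hid), "bc": np.zeros(d_hid),
            "Wxo": w(d_hid, d_xk), "Who": w(d_hid, d_hid), "Wco": w(d_hid, d_hid), "bo": np.zeros(d_hid),
            "Wy": w(d_out, d_hid), "by": np.zeros(d_out),
        })
    return params


def deep_lstm_step(x, h_prev, c_prev, params):
    """One time step over all layers; returns y(t) and the new h, c lists."""
    h_new, c_new, y_parts = [], [], []
    y_below = np.zeros(0)  # assumption: y'(t, 0) is empty
    for k, p in enumerate(params):
        x_k = np.concatenate([x, y_below])  # skip connection from the input
        i = sigmoid(p["Wxi"] @ x_k + p["Whi"] @ h_prev[k] + p["Wci"] @ c_prev[k] + p["bi"])
        # The forget gate uses x(t), following the equation as written.
        f = sigmoid(p["Wxf"] @ x + p["Whf"] @ h_prev[k] + p["Wcf"] @ c_prev[k] + p["bf"])
        c = f * c_prev[k] + i * np.tanh(p["Wxc"] @ x_k + p["Whc"] @ h_prev[k] + p["bc"])
        o = sigmoid(p["Wxo"] @ x_k + p["Who"] @ h_prev[k] + p["Wco"] @ c + p["bo"])
        h = o * np.tanh(c)
        y_below = p["Wy"] @ h + p["by"]  # y'(t, k)
        h_new.append(h)
        c_new.append(c)
        y_parts.append(y_below)
    # y(t) = y'(t, 1) || ... || y'(t, K): skip connections to the output
    return np.concatenate(y_parts), h_new, c_new


def deep_lstm_reader(inputs, params, d_hid):
    """Run over the embedded sequence and return the final y(t)."""
    h = [np.zeros(d_hid) for _ in params]
    c = [np.zeros(d_hid) for _ in params]
    y = None
    for x in inputs:
        y, h, c = deep_lstm_step(x, h, c, params)
    return y


rng = np.random.default_rng(0)
params = init_params(rng, n_layers=2, d_in=4, d_hid=5, d_out=3)
inputs = rng.normal(size=(6, 4))  # stand-in for embedded document, delimiter, query
g = deep_lstm_reader(inputs, params, d_hid=5)
```

With two layers and a per-layer output size of 3, `g` is a vector of length 6, the concatenation of the two $y'$ vectors at the last time step. A separate output layer, not shown here, would map this vector to a distribution over candidate answer tokens.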

Source: Teaching Machines to Read and Comprehend



| Task | Papers | Share |
| --- | --- | --- |
| Reading Comprehension | 1 | 100.00% |