ELMo-based Decomposable Attention

Last updated on Mar 15, 2021

Parameters 94 Million
File Size 665.34 MB
Training Data SNLI

Training Techniques AdaGrad
Architecture Convolution, Dropout, ELMo, Feedforward Network, Highway Layer, LSTM, Linear Layer, ReLU
Epochs 140


This model implements the Decomposable Attention model described in A Decomposable Attention Model for Natural Language Inference by Parikh et al., 2016, with some optional enhancements applied before the decomposable attention step. Parikh's original model allowed computing an "intra-sentence" attention before the decomposable entailment step; this implementation generalizes that to any Seq2SeqEncoder that can be applied to the premise and/or the hypothesis before computing entailment.

The basic outline of this model is to get an embedded representation of each word in the premise and hypothesis, align words between the two, compare the aligned phrases, and make a final entailment decision based on this aggregated comparison. Each step in this process uses a feedforward network to modify the representation.
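The align / compare / aggregate pipeline described above can be sketched in a few lines of NumPy. This is a minimal illustration of the shapes and data flow, not the actual AllenNLP implementation: `attend`, `compare`, and `aggregate` are stand-ins for the paper's learned feedforward networks (F, G, H), replaced here by plain callables.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attention(a, b, attend, compare, aggregate):
    """a: (len_a, dim) premise embeddings; b: (len_b, dim) hypothesis embeddings.
    attend/compare/aggregate stand in for the paper's feedforward networks."""
    # Attend: score every premise/hypothesis word pair, then soft-align.
    e = attend(a) @ attend(b).T            # (len_a, len_b) alignment scores
    beta = softmax(e, axis=1) @ b          # hypothesis phrase aligned to each premise word
    alpha = softmax(e, axis=0).T @ a       # premise phrase aligned to each hypothesis word
    # Compare: each word against its aligned soft phrase.
    v1 = compare(np.concatenate([a, beta], axis=-1))
    v2 = compare(np.concatenate([b, alpha], axis=-1))
    # Aggregate: sum over words, then classify the joined representation.
    return aggregate(np.concatenate([v1.sum(axis=0), v2.sum(axis=0)], axis=-1))

# Shape check with identity stand-ins for the feedforward networks.
rng = np.random.default_rng(0)
a, b = rng.random((3, 4)), rng.random((5, 4))
out = decomposable_attention(a, b, lambda x: x, lambda x: x, lambda x: x)
print(out.shape)  # (16,)
```

In the real model, each of these callables is a feedforward network, and the output of `aggregate` is fed to a final linear layer producing the three entailment logits.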

This model uses ELMo embeddings.
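In the AllenNLP configuration, the ELMo embeddings enter through the text field embedder. The fragment below is a sketch of what that section typically looks like; the exact values (and the options/weights URLs, which here are the standard publicly released ELMo files) should be confirmed against decomposable_attention_elmo.jsonnet:

```
"text_field_embedder": {
  "token_embedders": {
    "elmo": {
      "type": "elmo_token_embedder",
      "options_file": "https://allennlp.s3.amazonaws.com/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_options.json",
      "weight_file": "https://allennlp.s3.amazonaws.com/elmo/2x4096_512_2048cnn_2xhighway/elmo_2x4096_512_2048cnn_2xhighway_weights.hdf5",
      "do_layer_norm": false
    }
  }
}
```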

Explore the live Textual Entailment demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("pair-classification-decomposable-attention-elmo")

Getting predictions

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)
labels = ["entailment", "contradiction", "neutral"]  # SNLI label order
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.80%
# p(contradiction) = 90.98%
# p(neutral) = 8.21%
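To reduce `label_probs` to a single predicted label, pair the probabilities with the label order and take the argmax. A minimal sketch, reusing the probabilities printed above in place of a live `preds["label_probs"]`:

```python
labels = ["entailment", "contradiction", "neutral"]  # SNLI label order
label_probs = [0.0080, 0.9098, 0.0821]               # e.g. preds["label_probs"]

predicted, confidence = max(zip(labels, label_probs), key=lambda pair: pair[1])
print(predicted)  # contradiction
```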

You can also get predictions using the allennlp command-line interface:

echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz -

How do I evaluate this model?

To evaluate the model on the Stanford Natural Language Inference (SNLI) dev set, pass the model archive and the path to the dev file (shown here as a placeholder) to allennlp evaluate:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz \
    <path/to/snli_dev_set.jsonl>

How do I train this model?

To train this model, use the allennlp CLI tool with the configuration file decomposable_attention_elmo.jsonnet:

allennlp train decomposable_attention_elmo.jsonnet -s output_dir

See the AllenNLP Training and prediction guide for more details.


@article{Parikh2016ADA,
 author = {Ankur P. Parikh and Oscar T{\"a}ckstr{\"o}m and Dipanjan Das and Jakob Uszkoreit},
 journal = {ArXiv},
 title = {A Decomposable Attention Model for Natural Language Inference},
 volume = {abs/1606.01933},
 year = {2016}
}