| Training Techniques | AdaGrad |
|---|---|
| Architecture | Convolution, Dropout, ELMo, Feedforward Network, Highway Layer, LSTM, Linear Layer, ReLU |
| Epochs | 140 |
This model implements the Decomposable Attention model described in *A Decomposable Attention Model for Natural Language Inference* (Parikh et al., 2016), with some optional enhancements applied before the decomposable attention step. Parikh's original model allowed computing an "intra-sentence" attention before the decomposable entailment step. We generalize this to any `Seq2SeqEncoder` that can be applied to the premise and/or the hypothesis before computing entailment.
The basic outline of this model is to get an embedded representation of each word in the premise and hypothesis, align words between the two, compare the aligned phrases, and make a final entailment decision based on this aggregated comparison. Each step in this process uses a feedforward network to modify the representation.
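The attend / compare / aggregate outline above can be sketched in a few lines of numpy. This is a toy illustration only: the feedforward networks F, G, and H of the paper are replaced here by an identity map and random projections, and the "embeddings" are random vectors, so the output probabilities are meaningless; only the shapes and data flow mirror the real model.

```python
# Toy sketch of decomposable attention (Parikh et al., 2016):
# attend, compare, aggregate. Random matrices stand in for the
# trained feedforward networks F, G, H.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d = 8                        # toy embedding dimension
a = rng.normal(size=(5, d))  # premise: 5 word vectors
b = rng.normal(size=(4, d))  # hypothesis: 4 word vectors

# Attend: alignment scores e_ij = F(a_i) . F(b_j); F is identity here.
e = a @ b.T                       # (5, 4)
beta = softmax(e, axis=1) @ b     # hypothesis words aligned to each a_i
alpha = softmax(e.T, axis=1) @ a  # premise words aligned to each b_j

# Compare: G([a_i; beta_i]); a random projection stands in for G.
G = rng.normal(size=(2 * d, d))
v1 = np.concatenate([a, beta], axis=1) @ G   # (5, d)
v2 = np.concatenate([b, alpha], axis=1) @ G  # (4, d)

# Aggregate: sum each comparison over words, then a final classifier H
# produces three logits (entailment, contradiction, neutral).
H = rng.normal(size=(2 * d, 3))
logits = np.concatenate([v1.sum(axis=0), v2.sum(axis=0)]) @ H
probs = softmax(logits)
print(probs.shape)  # (3,)
```

Note that each step is a simple feedforward computation over (pairs of) word vectors; the only interaction between the sentences is the soft alignment, which is what makes the model "decomposable".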
This model uses ELMo embeddings.
Explore the live Textual Entailment demo at AllenNLP.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-decomposable-attention-elmo")

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)

# Label order follows the model's vocabulary.
labels = ["entailment", "contradiction", "neutral"]
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.80%
# p(contradiction) = 90.98%
# p(neutral) = 8.21%
```
You can also get predictions using the `allennlp` command-line interface:
```shell
echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz -
```
To evaluate the model on the Stanford Natural Language Inference (SNLI) test set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl
```
To train this model, use the `allennlp` CLI tool with the configuration file `decomposable_attention_elmo.jsonnet`:

```shell
allennlp train decomposable_attention_elmo.jsonnet -s output_dir
```
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Parikh2016ADA,
  author  = {Ankur P. Parikh and Oscar T{\"a}ckstr{\"o}m and Dipanjan Das and Jakob Uszkoreit},
  journal = {arXiv},
  title   = {A Decomposable Attention Model for Natural Language Inference},
  volume  = {abs/1606.01933},
  year    = {2016}
}
```