| Training Techniques | AdaGrad |
|---|---|
| Architecture | Convolution, Dropout, ELMo, Feedforward Network, Highway Layer, LSTM, Linear Layer, ReLU |
| Epochs | 140 |
This model implements the Decomposable Attention model described in *A Decomposable Attention Model for Natural Language Inference* (Parikh et al., 2016), with some optional enhancements applied before the decomposable attention step. Parikh's original model allowed computing an "intra-sentence" attention before the decomposable entailment step. We generalize this to any `Seq2SeqEncoder` that can be applied to the premise and/or the hypothesis before computing entailment.
The basic outline of this model is to get an embedded representation of each word in the premise and hypothesis, align words between the two, compare the aligned phrases, and make a final entailment decision based on this aggregated comparison. Each step in this process uses a feedforward network to modify the representation.
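The attend / compare / aggregate outline above can be sketched in a few lines of numpy. This is a toy illustration only: the feedforward networks F, G, and H of the paper are replaced here by an identity map and random projections, and the "embeddings" are random vectors, so the output probabilities are meaningless; only the shapes and data flow mirror the real model.

```python
# Toy sketch of decomposable attention (Parikh et al., 2016):
# attend, compare, aggregate. Random matrices stand in for the
# trained feedforward networks F, G, H.
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

d = 8                        # toy embedding dimension
a = rng.normal(size=(5, d))  # premise: 5 word vectors
b = rng.normal(size=(4, d))  # hypothesis: 4 word vectors

# Attend: alignment scores e_ij = F(a_i) . F(b_j); F is identity here.
e = a @ b.T                       # (5, 4)
beta = softmax(e, axis=1) @ b     # hypothesis words aligned to each a_i
alpha = softmax(e.T, axis=1) @ a  # premise words aligned to each b_j

# Compare: G([a_i; beta_i]); a random projection stands in for G.
G = rng.normal(size=(2 * d, d))
v1 = np.concatenate([a, beta], axis=1) @ G   # (5, d)
v2 = np.concatenate([b, alpha], axis=1) @ G  # (4, d)

# Aggregate: sum each comparison over words, then a final classifier H
# produces three logits (entailment, contradiction, neutral).
H = rng.normal(size=(2 * d, 3))
logits = np.concatenate([v1.sum(axis=0), v2.sum(axis=0)]) @ H
probs = softmax(logits)
print(probs.shape)  # (3,)
```

Note that each step is a simple feedforward computation over (pairs of) word vectors; the only interaction between the sentences is the soft alignment, which is what makes the model "decomposable".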
This model uses ELMo embeddings.
Explore the live Textual Entailment demo at AllenNLP.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-decomposable-attention-elmo")

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)

# Label order follows the model's vocabulary.
labels = ["entailment", "contradiction", "neutral"]
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.80%
# p(contradiction) = 90.98%
# p(neutral) = 8.21%
```
You can also get predictions using the `allennlp` command-line interface:
```shell
echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz -
```
To evaluate the model on the Stanford Natural Language Inference (SNLI) test set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/decomposable-attention-elmo-2020.04.09.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl
```
To train this model, use the `allennlp` CLI tool with the configuration file `decomposable_attention_elmo.jsonnet`:

```shell
allennlp train decomposable_attention_elmo.jsonnet -s output_dir
```
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Parikh2016ADA,
  author  = {Ankur P. Parikh and Oscar T{\"a}ckstr{\"o}m and Dipanjan Das and Jakob Uszkoreit},
  journal = {arXiv},
  title   = {A Decomposable Attention Model for Natural Language Inference},
  volume  = {abs/1606.01933},
  year    = {2016}
}
```