Last updated on Mar 15, 2021


Parameters: 356 Million
File Size: 1.24 GB
Training Data: SNLI
Training Techniques: AdamW
Architecture: Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh
LR: 2e-05
Epochs: 10
Dropout: 0.1
Batch Size: 32


This model implements a basic text classifier. The input text is embedded into a text field using a RoBERTa-large model. The resulting sequence is pooled with a cls_pooler Seq2VecEncoder and then passed to a linear classification layer, which projects into the label space.
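The pooling-plus-projection head described above can be sketched outside AllenNLP with plain NumPy (the dimensions and weights below are made up for illustration; the real model uses the trained RoBERTa-large encoder):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)

# Toy stand-in for RoBERTa-large output: 12 tokens, hidden size 1024.
hidden_states = rng.standard_normal((12, 1024))

# cls_pooler: take the vector of the first ([CLS]-style) token.
pooled = hidden_states[0]                # shape: (1024,)

# Linear classification layer projecting into the 3-way SNLI label space.
W = rng.standard_normal((3, 1024)) * 0.01
b = np.zeros(3)
probs = softmax(W @ pooled + b)          # one probability per label
```

With trained weights, `probs` holds p(entailment), p(contradiction), and p(neutral) for the input pair.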

Explore the live Textual Entailment demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("pair-classification-roberta-snli")

Getting predictions

premise = "It's a cat."
hypothesis = "It's Monday."
preds = predictor.predict(premise, hypothesis)
# The model's three SNLI labels, in the order used by preds["probs"]:
labels = ["entailment", "contradiction", "neutral"]
for label, prob in zip(labels, preds["probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.08%
# p(contradiction) = 26.72%
# p(neutral) = 73.21%

You can also get predictions using the allennlp command-line interface:

echo "{\"premise\": \"It's a cat.\", \"hypothesis\": \"It's Monday.\"}" | \
    allennlp predict <model.tar.gz> -

Here `<model.tar.gz>` stands for the path or URL of the trained model archive, and `-` tells allennlp predict to read input from stdin. (The outer double quotes avoid the broken shell quoting that the apostrophes in "It's" would otherwise cause inside single quotes.)

How do I evaluate this model?

To evaluate the model on the Stanford Natural Language Inference (SNLI) dev set, run:

allennlp evaluate <model.tar.gz> <snli-dev-file>

where `<model.tar.gz>` is the trained model archive and `<snli-dev-file>` is the path to the SNLI dev set.

How do I train this model?

To train this model, you can use the allennlp CLI tool and the configuration file snli_roberta.jsonnet:

allennlp train snli_roberta.jsonnet -s output_dir
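For orientation, a heavily abbreviated sketch of what such a configuration might contain (the keys below are illustrative, not a copy of the actual file; consult snli_roberta.jsonnet in allennlp-models for the real values):

```jsonnet
{
  "dataset_reader": {
    "type": "snli",
    // Tokenize and index with the same pretrained transformer as the model.
    "tokenizer": { "type": "pretrained_transformer", "model_name": "roberta-large" },
  },
  "model": {
    "type": "basic_classifier",
    // Pool the encoded sequence by taking the first ([CLS]-style) token.
    "seq2vec_encoder": { "type": "cls_pooler", "embedding_dim": 1024 },
  },
  "trainer": {
    "num_epochs": 10,
    "optimizer": { "type": "huggingface_adamw", "lr": 2e-5 },
  },
}
```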

See the AllenNLP Training and prediction guide for more details.


@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and Luke Zettlemoyer and Veselin Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}