RoBERTa MNLI

Last updated on Mar 15, 2021

Parameters: 356 million
File Size: 1.26 GB
Training Data: MultiNLI
Training Techniques: AdamW
Architecture: Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh
LR: 0.0
Epochs: 3
Dropout: 0.1
Batch Size: 16

Summary

This model implements a basic text classifier. The text is embedded into a text field using a RoBERTa-large model. The resulting sequence is pooled using a cls_pooler Seq2VecEncoder and then passed to a linear classification layer, which projects into the label space.
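
For illustration, here is a minimal sketch of that pipeline written against the Hugging Face transformers library rather than the AllenNLP components the released model is actually built from. The linear head below is randomly initialized (the released archive contains the trained weights), so it only shows the data flow: encode the sentence pair with RoBERTa-large, pool the first (CLS) token, and project into the three labels.

# Sketch of the architecture described above: RoBERTa-large encoder,
# CLS pooling, and a linear projection into the three MNLI labels.
# Note: this classification head is untrained; illustration only.
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
encoder = RobertaModel.from_pretrained("roberta-large")
classifier = torch.nn.Linear(encoder.config.hidden_size, 3)  # entailment / contradiction / neutral

inputs = tokenizer("A man overlooking bike maintenance.", "A man destroys a bike.", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # shape: (1, seq_len, 1024)
cls_embedding = hidden[:, 0, :]                   # "cls_pooler": keep the first (<s>) token
probs = torch.softmax(classifier(cls_embedding), dim=-1)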

Explore the live Textual Entailment demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("pair-classification-roberta-mnli")

Getting predictions

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)
labels = ["entailment", "contradiction", "neutral"]  # label order matches the probabilities below
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 1.50%
# p(contradiction) = 81.88%
# p(neutral) = 16.62%
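
If you only need the single most likely label, take the argmax over the same probabilities returned above:

# Pick the highest-probability label from preds["label_probs"].
best_label, best_prob = max(zip(labels, preds["label_probs"]), key=lambda pair: pair[1])
print(f"predicted label: {best_label} ({best_prob:.2%})")
# prints:
# predicted label: contradiction (81.88%)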

You can also get predictions using the allennlp command-line interface:

echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz -
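
The same archive URL can also be loaded directly in Python via AllenNLP's archival utilities. A minimal sketch; the predictor name "textual_entailment" is assumed to be the registered name of the pair-classification predictor:

# Load the trained model straight from the published archive URL.
from allennlp.models.archival import load_archive
from allennlp.predictors import Predictor
import allennlp_models.pair_classification  # registers the model, reader, and predictor

archive = load_archive(
    "https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz"
)
predictor = Predictor.from_archive(archive, "textual_entailment")  # assumed predictor name
preds = predictor.predict(
    "A man in a black shirt overlooking bike maintenance.", "A man destroys a bike."
)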

How do I evaluate this model?

To evaluate the model on the Multi-Genre Natural Language Inference (MultiNLI) mismatched dev set, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/multinli/multinli_1.0_dev_mismatched.jsonl

How do I train this model?

To train this model, use the allennlp CLI tool with the configuration file mnli_roberta.jsonnet:

allennlp train mnli_roberta.jsonnet -s output_dir
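
Training can also be launched from Python instead of the CLI. A minimal sketch, assuming the configuration file sits in the working directory and output_dir is a placeholder serialization directory:

# Programmatic counterpart of the `allennlp train` command above.
from allennlp.commands.train import train_model_from_file
from allennlp.common.util import import_module_and_submodules

# Make the allennlp_models components visible to the config,
# equivalent to passing `--include-package allennlp_models` on the CLI.
import_module_and_submodules("allennlp_models")

train_model_from_file("mnli_roberta.jsonnet", "output_dir")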

See the AllenNLP Training and prediction guide for more details.

Citation

@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and Luke Zettlemoyer and Veselin Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}