RoBERTa MNLI

Last updated on Mar 15, 2021

Parameters: 356 million
File Size: 1.26 GB
Training Data: MultiNLI
Training Techniques: AdamW
Architecture: Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh
LR: 0.0
Epochs: 3
Dropout: 0.1
Batch Size: 16

Summary

This model implements a basic text classifier. The text is embedded into a text field using a RoBERTa-large model. The resulting sequence is pooled using a cls_pooler Seq2VecEncoder and then passed to a linear classification layer, which projects into the label space.
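
For illustration, here is a minimal sketch of that pipeline written against the Hugging Face transformers library rather than the AllenNLP components the released model is actually built from. The linear head below is randomly initialized (the released archive contains the trained weights), so it only shows the data flow: encode the sentence pair with RoBERTa-large, pool the first (CLS) token, and project into the three labels.

# Sketch of the architecture described above: RoBERTa-large encoder,
# CLS pooling, and a linear projection into the three MNLI labels.
# Note: this classification head is untrained; illustration only.
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
encoder = RobertaModel.from_pretrained("roberta-large")
classifier = torch.nn.Linear(encoder.config.hidden_size, 3)  # entailment / contradiction / neutral

inputs = tokenizer("A man overlooking bike maintenance.", "A man destroys a bike.", return_tensors="pt")
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # shape: (1, seq_len, 1024)
cls_embedding = hidden[:, 0, :]                   # "cls_pooler": keep the first (<s>) token
probs = torch.softmax(classifier(cls_embedding), dim=-1)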

Explore the live Textual Entailment demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("pair-classification-roberta-mnli")

Getting predictions

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
preds = predictor.predict(premise, hypothesis)
labels = ["entailment", "contradiction", "neutral"]  # label order matches the probabilities below
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 1.50%
# p(contradiction) = 81.88%
# p(neutral) = 16.62%
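
If you only need the single most likely label, take the argmax over the same probabilities returned above:

# Pick the highest-probability label from preds["label_probs"].
best_label, best_prob = max(zip(labels, preds["label_probs"]), key=lambda pair: pair[1])
print(f"predicted label: {best_label} ({best_prob:.2%})")
# prints:
# predicted label: contradiction (81.88%)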

You can also get predictions using the allennlp command-line interface:

echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz -
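
The same archive URL can also be loaded directly in Python via AllenNLP's archival utilities. A minimal sketch; the predictor name "textual_entailment" is assumed to be the registered name of the pair-classification predictor:

# Load the trained model straight from the published archive URL.
from allennlp.models.archival import load_archive
from allennlp.predictors import Predictor
import allennlp_models.pair_classification  # registers the model, reader, and predictor

archive = load_archive(
    "https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz"
)
predictor = Predictor.from_archive(archive, "textual_entailment")  # assumed predictor name
preds = predictor.predict(
    "A man in a black shirt overlooking bike maintenance.", "A man destroys a bike."
)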

How do I evaluate this model?

To evaluate the model on the Multi-Genre Natural Language Inference (MultiNLI) mismatched dev set, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/multinli/multinli_1.0_dev_mismatched.jsonl

How do I train this model?

To train this model, use the allennlp CLI tool with the configuration file mnli_roberta.jsonnet:

allennlp train mnli_roberta.jsonnet -s output_dir
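
Training can also be launched from Python instead of the CLI. A minimal sketch, assuming the configuration file sits in the working directory and output_dir is a placeholder serialization directory:

# Programmatic counterpart of the `allennlp train` command above.
from allennlp.commands.train import train_model_from_file
from allennlp.common.util import import_module_and_submodules

# Make the allennlp_models components visible to the config,
# equivalent to passing `--include-package allennlp_models` on the CLI.
import_module_and_submodules("allennlp_models")

train_model_from_file("mnli_roberta.jsonnet", "output_dir")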

See the AllenNLP Training and prediction guide for more details.

Citation

@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and Luke Zettlemoyer and Veselin Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}