| Training Techniques | AdamW |
|---|---|
| Architecture | Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh |
| LR | 0.00002 |
This model implements a basic text classifier for textual entailment. The premise and hypothesis are embedded jointly as a single text field using a RoBERTa-large model. The resulting sequence is pooled with a `cls_pooler` `Seq2VecEncoder` and then passed to a linear classification layer, which projects into the label space (entailment, contradiction, neutral).
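
For orientation, here is a minimal sketch of that forward pass written in plain PyTorch/transformers rather than AllenNLP. The class name `EntailmentClassifier` and all details below are illustrative assumptions, not the model's actual implementation (which, per the table above, also includes dropout and a tanh activation):

```python
# Illustrative sketch only -- NOT the actual AllenNLP model code.
import torch
from transformers import AutoModel, AutoTokenizer

class EntailmentClassifier(torch.nn.Module):
    """RoBERTa encoder -> CLS pooling -> linear projection into the label space."""

    def __init__(self, num_labels: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-large")
        # Linear layer projecting the pooled CLS vector into the label space.
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        cls_vector = hidden[:, 0, :]        # "cls_pooler": take the first (<s>) token
        return self.classifier(cls_vector)  # unnormalized label logits

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = EntailmentClassifier()
batch = tokenizer("It's a cat.", "It's Monday.", return_tensors="pt")
probs = model(batch["input_ids"], batch["attention_mask"]).softmax(dim=-1)
```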
Explore the live Textual Entailment demo at AllenNLP.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-roberta-snli")

premise = "It's a cat."
hypothesis = "It's Monday."
preds = predictor.predict(premise, hypothesis)

# Recover the label order from the model's vocabulary so each
# probability can be printed with its label name.
label_map = predictor._model.vocab.get_index_to_token_vocabulary("labels")
labels = [label_map[i] for i in range(len(label_map))]

for label, prob in zip(labels, preds["probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.08%
# p(contradiction) = 26.72%
# p(neutral) = 73.21%
```
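
To score several pairs at once, the base AllenNLP `Predictor` API also provides `predict_batch_json`; the two example pairs below are made up for illustration:

```python
batch = [
    {"premise": "It's a cat.", "hypothesis": "It's an animal."},
    {"premise": "It's a cat.", "hypothesis": "It's Monday."},
]
# Returns one output dict (with a "probs" entry) per input pair.
for output in predictor.predict_batch_json(batch):
    print(output["probs"])
```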
You can also get predictions using the allennlp command-line interface:
echo '{"premise": "It's a cat.", "hypothesis": "It's Monday."}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz -
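
For more than a handful of examples, it is usually easier to keep one JSON object per line in a file and write the predictions out with the standard `--output-file` flag of `allennlp predict`; the file names here are placeholders:

```bash
# pairs.jsonl contains one {"premise": ..., "hypothesis": ...} object per line.
allennlp predict https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    pairs.jsonl --output-file predictions.jsonl
```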
To evaluate the model on the Stanford Natural Language Inference (SNLI) test set, run:
```bash
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl
```
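
To save the resulting metrics rather than just printing them, `allennlp evaluate` also takes an `--output-file` flag; `metrics.json` is a placeholder name:

```bash
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl \
    --output-file metrics.json
```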
To train this model yourself, use the allennlp CLI tool with the configuration file snli_roberta.jsonnet:
```bash
allennlp train snli_roberta.jsonnet -s output_dir
```
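
To tweak hyperparameters such as the learning rate from the table above without editing the config, `allennlp train` accepts an `-o`/`--overrides` JSON string; the override path below assumes the standard `trainer.optimizer.lr` layout of AllenNLP configs:

```bash
# Override the learning rate (2e-5, matching the LR in the table above).
allennlp train snli_roberta.jsonnet -s output_dir \
    -o '{"trainer": {"optimizer": {"lr": 2e-5}}}'
```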
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Liu2019RoBERTaAR,
  author  = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov},
  journal = {ArXiv},
  title   = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
  volume  = {abs/1907.11692},
  year    = {2019}
}
```