| Training Techniques | AdamW |
|---|---|
| Architecture | Dropout, Layer Normalization, Linear Layer, RoBERTa, Tanh |
| LR | 0.00001 |
This is a multiple-choice model patterned after the BERT architecture. It computes a score for each (question, alternative) sequence from its CLS token representation and then chooses the alternative with the highest score.
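To make that concrete, here is a minimal sketch of a CLS-based scoring head: each (question, alternative) pair is encoded, its CLS embedding is mapped to a scalar score, and the argmax picks the winner. The class name `MultipleChoiceHead` and the random stand-in embeddings are illustrative only, not part of the AllenNLP API; the real model uses a RoBERTa encoder plus the layers listed in the table above.

```python
# Minimal sketch of a CLS-based multiple-choice scoring head.
# In the real model a RoBERTa encoder produces the CLS embeddings;
# here random tensors stand in for them.
import torch
import torch.nn as nn

class MultipleChoiceHead(nn.Module):  # illustrative name, not the AllenNLP class
    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.scorer = nn.Linear(hidden_dim, 1)  # one scalar score per sequence

    def forward(self, cls_embeddings: torch.Tensor) -> torch.Tensor:
        # cls_embeddings: (num_alternatives, hidden_dim), one row per
        # encoded (question, alternative) pair.
        scores = self.scorer(self.dropout(cls_embeddings)).squeeze(-1)
        return scores.argmax(dim=-1)  # index of the highest-scoring alternative

head = MultipleChoiceHead().eval()
with torch.no_grad():
    best = head(torch.randn(3, 768))  # three alternatives
print(int(best))
```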
```python
from allennlp_models.pretrained import load_predictor

# Load the pretrained CommonsenseQA multiple-choice predictor.
predictor = load_predictor("mc-roberta-commonsenseqa")

question = "If I am tilting a drink toward my face, what should I do before the liquid spills over?"
alternatives = ["open mouth", "eat first", "use glass"]

preds = predictor.predict(question, alternatives)

# "best_alternative" holds the index of the highest-scoring alternative.
print(alternatives[preds["best_alternative"]])
# prints: open mouth
```
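If you want more than the chosen index, the returned dictionary can be inspected directly; which extra keys are present depends on the model's forward output, so this sketch simply prints whatever is there:

```python
# Inspect everything the predictor returned; the available keys depend
# on the model's forward() output.
for key, value in preds.items():
    print(key, value)
```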
You can also get predictions using the AllenNLP command line interface:
echo '{"prefix": "If I am tilting a drink toward my face, what should I do before the liquid spills over?",' \
'"alternatives": ["open mouth", "eat first", "use glass"]}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/commonsenseqa.2020-07-08.tar.gz -
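For batch prediction, `allennlp predict` also accepts an input file of JSON lines in place of stdin and can write the results to disk via `--output-file`; a sketch, where the file names are placeholders:

```bash
# questions.jsonl holds one {"prefix": ..., "alternatives": [...]} object per line.
allennlp predict \
    https://storage.googleapis.com/allennlp-public-models/commonsenseqa.2020-07-08.tar.gz \
    questions.jsonl \
    --output-file predictions.jsonl \
    --silent
```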
To train this model you can use the `allennlp` CLI tool and the configuration file `commonsenseqa.jsonnet`:
```bash
allennlp train commonsenseqa.jsonnet -s output_dir
```
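If you want to tweak hyperparameters without editing the config file, `allennlp train` accepts a JSON overrides string via `-o`; the values below are illustrative and assume the optimizer lives under `trainer`, as in typical AllenNLP configs:

```bash
allennlp train commonsenseqa.jsonnet -s output_dir \
    -o '{"trainer": {"num_epochs": 3, "optimizer": {"lr": 1e-5}}}'
```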
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Liu2019RoBERTaAR,
  author  = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov},
  journal = {ArXiv},
  title   = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
  volume  = {abs/1907.11692},
  year    = {2019}
}
```