| Training Techniques | AdamW |
|---|---|
| Architecture | Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh |
| LR | 0.00002 |
This model implements a basic text classifier for textual entailment. The premise and hypothesis are embedded jointly as a single text field using a RoBERTa-large model. The resulting sequence is pooled with a `cls_pooler` `Seq2VecEncoder` and then passed to a linear classification layer, which projects into the label space (entailment, contradiction, neutral).
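
For orientation, here is a minimal sketch of that forward pass written in plain PyTorch/transformers rather than AllenNLP. The class name `EntailmentClassifier` and all details below are illustrative assumptions, not the model's actual implementation (which, per the table above, also includes dropout and a tanh activation):

```python
# Illustrative sketch only -- NOT the actual AllenNLP model code.
import torch
from transformers import AutoModel, AutoTokenizer

class EntailmentClassifier(torch.nn.Module):
    """RoBERTa encoder -> CLS pooling -> linear projection into the label space."""

    def __init__(self, num_labels: int = 3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-large")
        # Linear layer projecting the pooled CLS vector into the label space.
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
        cls_vector = hidden[:, 0, :]        # "cls_pooler": take the first (<s>) token
        return self.classifier(cls_vector)  # unnormalized label logits

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = EntailmentClassifier()
batch = tokenizer("It's a cat.", "It's Monday.", return_tensors="pt")
probs = model(batch["input_ids"], batch["attention_mask"]).softmax(dim=-1)
```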
Explore the live Textual Entailment demo at AllenNLP.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-roberta-snli")

premise = "It's a cat."
hypothesis = "It's Monday."
preds = predictor.predict(premise, hypothesis)

# Recover the label order from the model's vocabulary so each
# probability can be printed with its label name.
label_map = predictor._model.vocab.get_index_to_token_vocabulary("labels")
labels = [label_map[i] for i in range(len(label_map))]

for label, prob in zip(labels, preds["probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 0.08%
# p(contradiction) = 26.72%
# p(neutral) = 73.21%
```
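
To score several pairs at once, the base AllenNLP `Predictor` API also provides `predict_batch_json`; the two example pairs below are made up for illustration:

```python
batch = [
    {"premise": "It's a cat.", "hypothesis": "It's an animal."},
    {"premise": "It's a cat.", "hypothesis": "It's Monday."},
]
# Returns one output dict (with a "probs" entry) per input pair.
for output in predictor.predict_batch_json(batch):
    print(output["probs"])
```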
You can also get predictions using the allennlp command-line interface:
echo '{"premise": "It's a cat.", "hypothesis": "It's Monday."}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz -
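
For more than a handful of examples, it is usually easier to keep one JSON object per line in a file and write the predictions out with the standard `--output-file` flag of `allennlp predict`; the file names here are placeholders:

```bash
# pairs.jsonl contains one {"premise": ..., "hypothesis": ...} object per line.
allennlp predict https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    pairs.jsonl --output-file predictions.jsonl
```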
To evaluate the model on the Stanford Natural Language Inference (SNLI) test set, run:
```bash
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl
```
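
To save the resulting metrics rather than just printing them, `allennlp evaluate` also takes an `--output-file` flag; `metrics.json` is a placeholder name:

```bash
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/snli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/snli/snli_1.0_test.jsonl \
    --output-file metrics.json
```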
To train this model yourself, use the allennlp CLI tool with the configuration file snli_roberta.jsonnet:
```bash
allennlp train snli_roberta.jsonnet -s output_dir
```
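
To tweak hyperparameters such as the learning rate from the table above without editing the config, `allennlp train` accepts an `-o`/`--overrides` JSON string; the override path below assumes the standard `trainer.optimizer.lr` layout of AllenNLP configs:

```bash
# Override the learning rate (2e-5, matching the LR in the table above).
allennlp train snli_roberta.jsonnet -s output_dir \
    -o '{"trainer": {"optimizer": {"lr": 2e-5}}}'
```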
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Liu2019RoBERTaAR,
  author  = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov},
  journal = {ArXiv},
  title   = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
  volume  = {abs/1907.11692},
  year    = {2019}
}
```