| | |
|---|---|
| Training Techniques | AdamW |
| Architecture | Dropout, Feedforward Network, Layer Normalization, Linear Layer, RoBERTa, Tanh |
| LR | 0.0 |
This model implements a basic text classifier. The input text is embedded using a RoBERTa-large model, the resulting sequence of embeddings is pooled with a `cls_pooler` Seq2VecEncoder, and the pooled vector is passed to a linear classification layer that projects into the label space.
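The pooling-plus-projection head described above can be sketched as follows. This is a minimal illustration, not the AllenNLP implementation; the hidden size of 1024 (RoBERTa-large) and the three MNLI labels are assumptions, and the weights here are random:

```python
import numpy as np

def cls_pool_and_classify(token_embeddings, weight, bias):
    """Sketch of the model head: cls_pooler followed by a linear projection.

    token_embeddings: (batch, seq_len, hidden) output of the encoder.
    weight: (num_labels, hidden) and bias: (num_labels,) -- the linear layer.
    """
    # "cls_pooler": keep only the embedding of the first token of each sequence.
    pooled = token_embeddings[:, 0, :]
    # Linear classification layer projects into the label space.
    logits = pooled @ weight.T + bias
    # Softmax over the label space gives the label probabilities.
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
hidden, num_labels = 1024, 3  # assumed: RoBERTa-large hidden size, MNLI labels
embeddings = rng.normal(size=(2, 16, hidden))  # (batch, seq_len, hidden)
w, b = rng.normal(size=(num_labels, hidden)), np.zeros(num_labels)
probs = cls_pool_and_classify(embeddings, w, b)
print(probs.shape)  # (2, 3); each row sums to 1
```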
Explore the live Textual Entailment demo at AllenNLP.
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("pair-classification-roberta-mnli")

premise = "A man in a black shirt overlooking bike maintenance."
hypothesis = "A man destroys a bike."
labels = ["entailment", "contradiction", "neutral"]

preds = predictor.predict(premise, hypothesis)
for label, prob in zip(labels, preds["label_probs"]):
    print(f"p({label}) = {prob:.2%}")
# prints:
# p(entailment) = 1.50%
# p(contradiction) = 81.88%
# p(neutral) = 16.62%
```
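To turn the probability vector into a single prediction, take the label with the highest probability. This small snippet is illustrative only; the probability values are copied from the example output above:

```python
labels = ["entailment", "contradiction", "neutral"]
probs = [0.0150, 0.8188, 0.1662]  # label_probs from the example above

# The predicted label is the one with the highest probability.
label, prob = max(zip(labels, probs), key=lambda pair: pair[1])
print(f"predicted: {label} ({prob:.2%})")  # predicted: contradiction (81.88%)
```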
You can also get predictions using the allennlp command-line interface:

```shell
echo '{"premise": "A man in a black shirt overlooking bike maintenance.", "hypothesis": "A man destroys a bike."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz -
```
To evaluate the model on the Multi-Genre Natural Language Inference (MultiNLI) mismatched dev set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/mnli-roberta-2020-07-29.tar.gz \
    https://allennlp.s3.amazonaws.com/datasets/multinli/multinli_1.0_dev_mismatched.jsonl
```
To train this model, use the allennlp CLI tool with the configuration file mnli_roberta.jsonnet:

```shell
allennlp train mnli_roberta.jsonnet -s output_dir
```
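For orientation, an AllenNLP training configuration generally has the following top-level shape. This is an illustrative skeleton only, not the actual contents of mnli_roberta.jsonnet; the reader type, model type, data paths, and hyperparameters here are placeholders:

```jsonnet
{
  "dataset_reader": { "type": "snli" },  // placeholder reader type
  "train_data_path": "path/to/multinli_train.jsonl",       // placeholder path
  "validation_data_path": "path/to/multinli_dev.jsonl",    // placeholder path
  "model": {
    "type": "basic_classifier"  // placeholder; embedder, seq2vec_encoder, etc. go here
  },
  "data_loader": { "batch_size": 32 },
  "trainer": { "optimizer": "adamw", "num_epochs": 10 }
}
```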
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Liu2019RoBERTaAR,
  author  = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and Luke Zettlemoyer and Veselin Stoyanov},
  journal = {ArXiv},
  title   = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
  volume  = {abs/1907.11692},
  year    = {2019}
}
```