| Training Techniques | AdamW |
| --- | --- |
| Architecture | Dropout, Layer Normalization, Linear Layer, RoBERTa, Tanh |
| LR | 0.00001 |
This is a multiple-choice model patterned after the BERT architecture. It computes a score for each (question, alternative) sequence on top of the [CLS] token, then selects the alternative with the highest score.
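For intuition, here is a minimal sketch of that scoring scheme using PyTorch and Hugging Face `transformers`. This is not the AllenNLP implementation: the randomly initialized `score_layer`, the 0.1 dropout rate, and the use of `roberta-base` are illustrative assumptions. Each (question, alternative) pair is encoded separately; the pooled [CLS] representation (Linear + Tanh, matching the architecture table above) passes through dropout and a linear layer to yield one scalar per alternative.

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")
dropout = torch.nn.Dropout(0.1)  # assumed dropout rate, for illustration only
score_layer = torch.nn.Linear(encoder.config.hidden_size, 1)  # untrained stand-in

def pick_best(question: str, alternatives: list[str]) -> int:
    """Score each (question, alternative) pair; return the index of the best one."""
    scores = []
    with torch.no_grad():
        for alt in alternatives:
            inputs = tokenizer(question, alt, return_tensors="pt", truncation=True)
            # pooler_output is the [CLS] embedding passed through Linear + Tanh
            pooled = encoder(**inputs).pooler_output
            scores.append(score_layer(dropout(pooled)).squeeze())
    return int(torch.stack(scores).argmax())
```

To get predictions from the released model, use the Python API: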
```python
from allennlp_models.pretrained import load_predictor

predictor = load_predictor("mc-roberta-piqa")

question = "To separate egg whites from the yolk using a water bottle, you should"
alternatives = [
    "Squeeze the water bottle and press it against the yolk. Release, which creates suction and lifts the yolk.",
    "Place the water bottle and press it against the yolk. Keep pushing, which creates suction and lifts the yolk."
]
preds = predictor.predict(question, alternatives)
print(alternatives[preds["best_alternative"]])
# prints: Place the water bottle and press it against the yolk. Keep pushing, which creates suction and lifts the yolk.
```
You can also get predictions using the AllenNLP command line interface:
echo '{"prefix": "To separate egg whites from the yolk using a water bottle, you should",' \
'"alternatives": [' \
'"Squeeze the water bottle and press it against the yolk. Release, which creates suction and lifts the yolk.",' \
'"Place the water bottle and press it against the yolk. Keep pushing, which creates suction and lifts the yolk."' \
']}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/piqa.2020-07-08.tar.gz -
To train this model, use the allennlp CLI tool with the configuration file piqa.jsonnet:

```bash
allennlp train piqa.jsonnet -s output_dir
```
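For orientation, the skeleton below sketches what such a configuration typically contains. It is illustrative only: the registered names ("piqa", "transformer_mc", "huggingface_adamw"), the data paths, and all hyperparameters other than the learning rate from the table above are assumptions; the actual piqa.jsonnet in the allennlp-models repository is the authoritative version.

```jsonnet
// Illustrative skeleton only; see piqa.jsonnet in allennlp-models for the
// real configuration. Registered names and most values are assumptions.
{
  "dataset_reader": {
    "type": "piqa",                          // assumed reader name
    "transformer_model_name": "roberta-large"
  },
  "train_data_path": "...",                  // path/URL to the PIQA train split
  "validation_data_path": "...",             // path/URL to the PIQA dev split
  "model": {
    "type": "transformer_mc",                // assumed multiple-choice model type
    "transformer_model": "roberta-large"
  },
  "data_loader": {
    "batch_size": 16,
    "shuffle": true
  },
  "trainer": {
    "num_epochs": 3,
    "optimizer": {
      "type": "huggingface_adamw",           // AdamW, per the table above
      "lr": 1e-5                             // LR from the table above
    }
  }
}
```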
See the AllenNLP Training and prediction guide for more details.
```bibtex
@article{Liu2019RoBERTaAR,
  author  = {Yinhan Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and Mike Lewis and Luke Zettlemoyer and Veselin Stoyanov},
  journal = {ArXiv},
  title   = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
  volume  = {abs/1907.11692},
  year    = {2019}
}
```