Transformer QA

Last updated on Mar 15, 2021

Parameters: 355 million
File size: 1.25 GB
Training data: SQuAD

Training techniques: AdamW
Architecture: Dropout, Layer Normalization, Linear Layer, RoBERTa, Tanh
Learning rate: 2e-05
Epochs: 5
Batch size: 16

Summary

This is a reading comprehension model patterned after the one proposed in BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018), with improvements borrowed from the SQuAD model in the transformers project. It predicts the start and end tokens of the answer span with a linear layer on top of the word piece embeddings.

Explore the live Reading Comprehension demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("rc-transformer-qa")

Getting predictions

question = "Who graduated in 1936?"
passage = ("In 1932, Shannon entered the University of Michigan,"
    " where he was introduced to the work of George Boole. He"
    " graduated in 1936 with two bachelor's degrees: one in"
    " electrical engineering and the other in mathematics."
)
preds = predictor.predict(question, passage)
print(preds["best_span_str"])
# prints: Shannon

You can also get predictions using the allennlp command-line interface:

echo '{"question": "Who graduated in 1936?", "passage": "In 1932, Shannon entered the University..."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/transformer-qa.2021-02-11.tar.gz -
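
For more than one question, it may be easier to prepare the input as a file with one JSON object per line, using the same "question" and "passage" keys as the command above. The following is a small sketch under that assumption; the helper `to_json_lines`, the file name `questions.jsonl`, and the second question are hypothetical.

```python
import json
from typing import Dict, List


def to_json_lines(examples: List[Dict[str, str]]) -> str:
    """Serialize each example as one JSON object per line (hypothetical helper)."""
    return "".join(json.dumps(example) + "\n" for example in examples)


passage = (
    "In 1932, Shannon entered the University of Michigan,"
    " where he was introduced to the work of George Boole. He"
    " graduated in 1936 with two bachelor's degrees: one in"
    " electrical engineering and the other in mathematics."
)
examples = [
    {"question": "Who graduated in 1936?", "passage": passage},
    # Second question is an illustrative assumption, not from the model card.
    {"question": "Which university did Shannon enter in 1932?", "passage": passage},
]

payload = to_json_lines(examples)
# "questions.jsonl" is an arbitrary file name chosen for this sketch.
with open("questions.jsonl", "w", encoding="utf-8") as f:
    f.write(payload)
```

The resulting file can then be given to allennlp predict as the input file argument in place of `-`.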

How do I evaluate this model?

To evaluate the model on the SQuAD dev set, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/transformer-qa.2021-02-11.tar.gz \
    https://s3-us-west-2.amazonaws.com/allennlp/datasets/squad/squad-dev-v2.0.json

How do I train this model?

To train this model, you can use the allennlp CLI tool and the configuration file transformer_qa.jsonnet:

allennlp train transformer_qa.jsonnet -s output_dir

See the AllenNLP Training and prediction guide for more details.

Citation

@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and L. Zettlemoyer and V. Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}

Results

Question Answering on SQuAD1.1 dev

Benchmark     Model           Metric  Value  Global Rank
SQuAD1.1 dev  Transformer QA  EM      84     #1
SQuAD1.1 dev  Transformer QA  F1      88     #1
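
For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are commonly computed for a single predicted answer against a single reference, following the normalization style of the SQuAD evaluation script (lowercasing, stripping punctuation and articles). It is a simplified illustration and not the code that produced the numbers above; the official script additionally takes the maximum over several reference answers per question and averages over the dataset.

```python
import re
import string
from collections import Counter

_PUNCTUATION = set(string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Shannon", "shannon."))     # prints: 1.0
print(exact_match("Claude Shannon", "Shannon"))  # prints: 0.0
print(round(f1_score("Claude Shannon", "Shannon"), 3))  # prints: 0.667
```

F1 gives partial credit when the predicted span overlaps the reference without matching it exactly, which is why it is typically higher than EM.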