Transformer QA

# Summary

This is a reading comprehension model patterned after the model proposed in [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018)](https://api.semanticscholar.org/CorpusID:52967399), with improvements borrowed from the SQuAD model in the transformers project. It predicts the start and end tokens of the answer span with a linear layer on top of word-piece embeddings.

[Explore the live reading comprehension demo at AllenNLP](https://demo.allennlp.org/reading-comprehension/transformer-qa).
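
The span prediction described in the summary can be illustrated with a small, framework-free sketch. Given one start score and one end score per token, it picks the pair `(start, end)` with the highest combined score, subject to `start <= end`. The function name `best_span`, the `max_answer_length` cap, and the logit values are assumptions made for illustration; the decoding used by the actual model may differ.

```python
from typing import List, Tuple


def best_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_length: int = 30,  # hypothetical cap, not taken from the model config
) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_length tokens are considered.
    """
    if not start_logits or len(start_logits) != len(end_logits):
        raise ValueError("start_logits and end_logits must be non-empty and equally long")

    best_score = float("-inf")
    best = (0, 0)
    for start, start_score in enumerate(start_logits):
        last = min(len(end_logits), start + max_answer_length)
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


# Made-up scores for a short passage; a real model produces these per word piece.
tokens = ["In", "1932", ",", "Shannon", "entered", "the", "University"]
start_logits = [0.1, 0.2, -1.0, 4.0, 0.3, -0.5, 0.0]
end_logits = [0.0, 0.1, -1.0, 3.5, 0.2, -0.2, 1.0]

start, end = best_span(start_logits, end_logits)
print(" ".join(tokens[start : end + 1]))
# prints: Shannon
```

The exhaustive double loop is quadratic in passage length, which is fine for a sketch; batched implementations typically vectorize the same search.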

## How do I load this model?

```python
from allennlp_models.pretrained import load_predictor
predictor = load_predictor("rc-transformer-qa")
```

### Getting predictions

```python
question = "Who graduated in 1936?"
passage = ("In 1932, Shannon entered the University of Michigan,"
    " where he was introduced to the work of George Boole. He"
    " graduated in 1936 with two bachelor's degrees: one in"
    " electrical engineering and the other in mathematics."
)
preds = predictor.predict(question, passage)
print(preds["best_span_str"])
# prints: Shannon
```

You can also get predictions using the `allennlp` command-line interface:

```shell
echo '{"question": "Who graduated in 1936?", "passage": "In 1932, Shannon entered the University..."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/transformer-qa.2021-02-11.tar.gz -
```
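
For more than one example, the command above reads one JSON object per line. The following is a minimal sketch of generating that input with the standard library, assuming the same `question` and `passage` keys as the command above; the helper name `to_jsonl` and the second example are made up for illustration.

```python
import json
from typing import Dict, List


def to_jsonl(records: List[Dict[str, str]]) -> str:
    """Serialize records as JSON lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records) + "\n"


passage = (
    "In 1932, Shannon entered the University of Michigan,"
    " where he was introduced to the work of George Boole. He"
    " graduated in 1936 with two bachelor's degrees: one in"
    " electrical engineering and the other in mathematics."
)

# Each record pairs a question with the passage it should be answered from.
examples = [
    {"question": "Who graduated in 1936?", "passage": passage},
    {"question": "Which university did Shannon enter?", "passage": passage},
]

print(to_jsonl(examples), end="")
```

Saved as a script (say, a hypothetical `make_inputs.py`), its output can be piped into `allennlp predict` in place of the `echo` command, keeping the trailing `-` so that input is read from standard input.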

## How do I evaluate this model?

To evaluate the model on the SQuAD dev set, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/transformer-qa.2021-02-11.tar.gz \
    https://s3-us-west-2.amazonaws.com/allennlp/datasets/squad/squad-dev-v2.0.json
```
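
SQuAD results are conventionally reported as exact match (EM) and token-level F1. The sketch below shows how these two scores can be computed for a single prediction against a single reference answer, using a simplified version of the normalization found in the official SQuAD evaluation script (lowercasing, then stripping punctuation, articles, and extra whitespace). It illustrates what the metrics mean and is not the code that `allennlp evaluate` runs; dataset-level scores additionally take the best match over all reference answers and average over questions.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, then strip punctuation, English articles, and extra whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    if not pred_tokens or not truth_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Shannon", "shannon."))
# prints: 1.0
print(round(f1_score("Claude Shannon", "Shannon"), 3))
# prints: 0.667
```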

## How do I train this model?

To train this model, you can use the `allennlp` CLI tool and the configuration file [transformer_qa.jsonnet](https://raw.githubusercontent.com/allenai/allennlp-models/v2.1.0/training_config/rc/transformer_qa.jsonnet):

```shell
allennlp train transformer_qa.jsonnet -s output_dir
```

See the [AllenNLP Training and prediction](https://guide.allennlp.org/training-and-prediction#2) guide for more details.

## Citation

```bibtex
@article{Liu2019RoBERTaAR,
 author = {Y. Liu and Myle Ott and Naman Goyal and Jingfei Du and Mandar Joshi and Danqi Chen and Omer Levy and M. Lewis and L. Zettlemoyer and V. Stoyanov},
 journal = {ArXiv},
 title = {RoBERTa: A Robustly Optimized BERT Pretraining Approach},
 volume = {abs/1907.11692},
 year = {2019}
}
```

| Hyperparameter | Value |
| --- | --- |
| Learning rate | 0.00002 |
| Epochs | 5 |
| Batch size | 16 |

## Results

| Benchmark | Model | Metric | Value | Global rank |
| --- | --- | --- | --- | --- |
| SQuAD1.1 dev | Transformer QA | EM | 84 | #1 |
| SQuAD1.1 dev | Transformer QA | F1 | 88 | #1 |

allenai / allennlp

| Property | Value |
| --- | --- |
| Training techniques | AdamW |
| Architecture | Dropout, Layer Normalization, Linear Layer, RoBERTa, Tanh |