ELMo-BiDAF

Last updated on Mar 15, 2021

Parameters: 113 Million
File Size: 398.75 MB
Training Data: SQuAD
Training Techniques: Adam
Architecture: Convolution, Dropout, ELMo, Highway Layer, LSTM, Linear Layer, ReLU
Epochs: 20
Dropout: 0.2
Batch Size: 40
README.md

Summary

This is an implementation of the BiDAF model with ELMo embeddings. The layout is straightforward: encode each word as a combination of word embeddings and a character-level encoder; pass the word representations through a bi-LSTM/GRU; use a matrix of attentions to inject question information into the passage word representations (the only non-standard part); pass the result through a few more bi-LSTM/GRU layers; and apply a softmax over the span start and span end positions.
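
To make the attention and span-selection steps more concrete, here is a minimal pure-Python sketch. It is not the AllenNLP implementation: the function names (`attend_to_question`, `best_span`) are illustrative, and the similarity is a plain dot product chosen for brevity rather than the similarity function the model uses. The span search assumes the usual constraint that the span start must not come after the span end.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_to_question(passage_vecs, question_vecs):
    """For each passage word, return an attention-weighted sum of question vectors.

    Assumption: dot-product similarity, used here only to keep the sketch short.
    """
    dim = len(question_vecs[0])
    attended = []
    for p in passage_vecs:
        # One row of the passage-by-question attention matrix.
        sims = [sum(a * b for a, b in zip(p, q)) for q in question_vecs]
        weights = softmax(sims)
        attended.append(
            [sum(w * q[d] for w, q in zip(weights, question_vecs)) for d in range(dim)]
        )
    return attended


def best_span(start_logits, end_logits):
    """Return (start, end) maximizing p_start[start] * p_end[end] with start <= end."""
    p_start = softmax(start_logits)
    p_end = softmax(end_logits)
    best = (0, 0)
    best_score = -1.0
    best_start = 0  # index of the largest start probability seen so far
    for end in range(len(p_end)):
        if p_start[end] > p_start[best_start]:
            best_start = end
        score = p_start[best_start] * p_end[end]
        if score > best_score:
            best_score = score
            best = (best_start, end)
    return best
```

The single left-to-right pass in `best_span` tracks the best start seen so far, so the search stays linear in the passage length instead of comparing every start with every end.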

Explore the live Reading Comprehension demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("rc-bidaf-elmo")

Getting predictions

question = "Who graduated in 1936?"
passage = ("In 1932, Shannon entered the University of Michigan,"
    " where he was introduced to the work of George Boole. He"
    " graduated in 1936 with two bachelor's degrees: one in"
    " electrical engineering and the other in mathematics."
)
preds = predictor.predict(question, passage)
print(preds["best_span_str"])
# prints: George Boole

You can also get predictions using the allennlp command-line interface:

echo '{"question": "Who graduated in 1936?", "passage": "In 1932, Shannon entered the University..."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/bidaf-elmo.2021-02-11.tar.gz -
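
For more than one question, it can be convenient to generate the input instead of typing JSON by hand. The sketch below builds one JSON object per line using the same `question` and `passage` keys as the command above. The helper name `to_jsonl` and the shortened example passage are assumptions made for illustration.

```python
import json


def to_jsonl(examples):
    """Serialize (question, passage) pairs as one JSON object per line."""
    return "\n".join(
        json.dumps({"question": question, "passage": passage})
        for question, passage in examples
    )


examples = [
    (
        "Who graduated in 1936?",
        "In 1932, Shannon entered the University of Michigan. "
        "He graduated in 1936 with two bachelor's degrees.",
    ),
]
lines = to_jsonl(examples)
print(lines)
```

The printed text can be piped into the command above, or saved to a file whose path is passed in place of the trailing `-`.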

How do I evaluate this model?

To evaluate the model on the SQuAD dev set, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/bidaf-elmo.2021-02-11.tar.gz \
    https://s3-us-west-2.amazonaws.com/allennlp/datasets/squad/squad-dev-v1.1.json

How do I train this model?

To train this model, you can use the allennlp CLI tool and the configuration file bidaf_elmo.jsonnet:

allennlp train bidaf_elmo.jsonnet -s output_dir

See the AllenNLP Training and prediction guide for more details.

Citation

@article{Seo2017BidirectionalAF,
 author = {Minjoon Seo and Aniruddha Kembhavi and Ali Farhadi and Hannaneh Hajishirzi},
 journal = {ArXiv},
 title = {Bidirectional Attention Flow for Machine Comprehension},
 volume = {abs/1611.01603},
 year = {2017}
}

Results

Question Answering on SQuAD1.1 dev

Benchmark     Model       Metric  Value  Global Rank
SQuAD1.1 dev  ELMo-BiDAF  EM      71     #2
SQuAD1.1 dev  ELMo-BiDAF  F1      80     #2
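
For readers unfamiliar with the two metrics, the following is a minimal sketch of SQuAD-style exact match (EM) and token-level F1 for a single prediction against a single reference answer. It follows the commonly used normalization (lowercasing, then stripping punctuation, articles, and extra whitespace). The function names are illustrative, and the code used by `allennlp evaluate` may differ in details. Reported scores are percentages averaged over the dev set, where each question is scored against its best-matching reference answer.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    true_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)
```

A partially correct span such as "Claude Shannon" against the reference "Shannon" scores 0 on EM but earns partial credit on F1, which is why F1 is typically the higher of the two numbers.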