ELMo-based Named Entity Recognition

Last updated on Mar 15, 2021

Parameters: 98 million
Encoder layers: 2
File size: 348.42 MB
Training data: CoNLL-2003

Training techniques: Adam
Architecture: CRF, Convolution, Dropout, ELMo, Highway Layer, LSTM, Linear Layer, ReLU
Learning rate: 0.001
Epochs: 75
Dropout: 0.5
Encoder type: LSTM
Encoder input size: 1202
Encoder hidden size: 200
Encoder bidirectional: True
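
The encoder sizes listed above fix the width of the per-token vectors that the encoder hands to the layers after it. The sketch below works through that arithmetic in plain Python. The split of the 1202-dimensional input into ELMo (1024), GloVe (50), and character-CNN (128) features is an assumption that happens to sum to the listed size; it is not stated on this page.

```python
# Encoder hyperparameters copied from the list above.
encoder = {
    "type": "lstm",
    "num_layers": 2,
    "input_size": 1202,
    "hidden_size": 200,
    "bidirectional": True,
}

# ASSUMPTION: one plausible breakdown of the input features (not given on this page).
assumed_input_parts = {"elmo": 1024, "glove": 50, "char_cnn": 128}


def encoder_output_size(cfg):
    """Per-token output width: forward and backward states are concatenated."""
    directions = 2 if cfg["bidirectional"] else 1
    return cfg["hidden_size"] * directions


# The assumed parts must add up to the listed encoder input size.
assert sum(assumed_input_parts.values()) == encoder["input_size"]
print(encoder_output_size(encoder))  # 400
```

With a bidirectional encoder, each token is therefore represented by a 400-dimensional vector after encoding.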
Summary

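This model is the baseline model described in "Semi-supervised sequence tagging with bidirectional language models" (Peters et al., 2017). It is described as using a Gated Recurrent Unit (GRU) character encoder as well as a GRU phrase encoder; note, however, that the hyperparameters listed above report a two-layer bidirectional LSTM encoder and a convolutional component. It starts with pretrained GloVe vectors for its token embeddings and was trained on the CoNLL-2003 NER dataset.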

Explore the live Named Entity Recognition demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("tagging-elmo-crf-tagger")

Getting predictions

sentence = "Jobs and Wozniak cofounded Apple in 1976."
preds = predictor.predict(sentence)
for word, tag in zip(preds["words"], preds["tags"]):
    print(word, tag)
# prints:
# Jobs U-PER
# and O
# Wozniak U-PER
# cofounded O
# Apple U-ORG
# in O
# 1976 O
# . O
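
The tags printed above appear to follow the BIOUL scheme, in which U- marks a single-token entity and B- ... L- brackets a multi-token one. Downstream code usually wants whole entities rather than per-token tags, so here is a minimal sketch of that grouping step. The helper name bioul_to_spans is hypothetical and not part of AllenNLP, and the sketch assumes well-formed tag sequences.

```python
from typing import List, Tuple


def bioul_to_spans(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (entity_text, label) pairs from BIOUL tags.

    U-X is a single-token entity; B-X ... L-X brackets a multi-token entity.
    Spans that are never closed or that change label midway are dropped.
    """
    spans = []
    current: List[str] = []
    current_label = None
    for word, tag in zip(words, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            spans.append((word, label))
            current, current_label = [], None
        elif prefix == "B":
            current, current_label = [word], label
        elif prefix in ("I", "L") and current and label == current_label:
            current.append(word)
            if prefix == "L":
                spans.append((" ".join(current), label))
                current, current_label = [], None
        else:  # "O" or an inconsistent tag: discard any open span
            current, current_label = [], None
    return spans


words = ["Jobs", "and", "Wozniak", "cofounded", "Apple", "in", "1976", "."]
tags = ["U-PER", "O", "U-PER", "O", "U-ORG", "O", "O", "O"]
print(bioul_to_spans(words, tags))
# [('Jobs', 'PER'), ('Wozniak', 'PER'), ('Apple', 'ORG')]
```

The same function can be called with preds["words"] and preds["tags"] from the predictor output.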

You can also get predictions using the allennlp command line interface:

echo '{"sentence": "Jobs and Wozniak cofounded Apple in 1976."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz -
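
For more than one sentence, the predict command can read a file with one JSON object per line instead of standard input. A minimal sketch for producing such a file follows; the file name sentences.jsonl and the second sentence are illustrative choices, not part of the original instructions.

```python
import json

sentences = [
    "Jobs and Wozniak cofounded Apple in 1976.",
    "AllenNLP was developed at the Allen Institute for AI.",  # illustrative
]


def write_jsonl(path, sentences):
    """Write one {"sentence": ...} object per line, matching the key used above."""
    with open(path, "w", encoding="utf-8") as f:
        for s in sentences:
            f.write(json.dumps({"sentence": s}) + "\n")


write_jsonl("sentences.jsonl", sentences)
```

The resulting file path can then take the place of the trailing - in the command above.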

How do I evaluate this model?

To evaluate the model on the CoNLL-2003 NER dataset, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz \
    path/to/dataset

How do I train this model?

To train this model, you can use the allennlp CLI tool with the configuration file ner_elmo.jsonnet:

allennlp train ner_elmo.jsonnet -s output_dir

See the AllenNLP Training and prediction guide for more details.

Citation

@inproceedings{Peters2017SemisupervisedST,
 author = {Matthew E. Peters and Waleed Ammar and Chandra Bhagavatula and R. Power},
 booktitle = {ACL},
 title = {Semi-supervised sequence tagging with bidirectional language models},
 year = {2017}
}

Results

Named Entity Recognition on CoNLL 2003 (English)

Task: Named Entity Recognition
Benchmark: CoNLL 2003 (English)
Model: ELMo-based Named Entity Recognition
Metric: F1
Value: 96
Global rank: #1
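
The F1 value above is presumably the entity-level score customary for CoNLL-2003, where a predicted entity counts as correct only if both its boundaries and its type match the gold annotation. The sketch below illustrates that calculation on hand-made spans. It is not the evaluation code used to produce the reported number, and the example spans are hypothetical.

```python
from typing import Set, Tuple

Span = Tuple[int, int, str]  # (start_token, end_token_inclusive, label)


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact span-and-label matches."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: the tagger finds both people but mislabels "Apple".
gold = {(0, 0, "PER"), (2, 2, "PER"), (4, 4, "ORG")}
predicted = {(0, 0, "PER"), (2, 2, "PER"), (4, 4, "LOC")}
precision, recall, f1 = span_f1(gold, predicted)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.667 0.667 0.667
```

Because a mislabeled entity counts as both a false positive and a false negative, a single type error lowers precision and recall together.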