ELMo-based Named Entity Recognition

Last updated on Mar 15, 2021

Parameters: 98 million
File Size: 348.42 MB
Training Data: CoNLL 2003

Training Techniques: Adam
Architecture: CRF, Convolution, Dropout, ELMo, Highway Layer, LSTM, Linear Layer, ReLU
LR: 0.001
Epochs: 75
Dropout: 0.5
Encoder Type: LSTM
Encoder Layers: 2
Encoder Input Size: 1202
Encoder Hidden Size: 200
Encoder Bidirectional: True
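
The encoder settings listed here imply a few derived quantities. The sketch below computes the encoder's output width and an approximate parameter count. It assumes a PyTorch-style LSTM parameterization (input-to-hidden weights, hidden-to-hidden weights, and two bias vectors per direction and layer); that layout is an assumption for illustration, not something this page states, and `lstm_parameter_count` is a hypothetical helper.

```python
def lstm_parameter_count(input_size: int, hidden_size: int,
                         num_layers: int, bidirectional: bool) -> int:
    """Count LSTM weights, assuming a PyTorch-style layout (assumption)."""
    directions = 2 if bidirectional else 1
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        # 4 gates, each with input weights, recurrent weights, and two biases
        per_direction = 4 * (layer_input * hidden_size
                             + hidden_size * hidden_size
                             + 2 * hidden_size)
        total += directions * per_direction
        # Later layers consume the concatenated outputs of both directions
        layer_input = hidden_size * directions
    return total


# Values taken from the hyperparameters listed on this page
encoder_output_size = 200 * 2  # hidden size times two directions
encoder_params = lstm_parameter_count(1202, 200, 2, True)

print(encoder_output_size)  # 400
print(encoder_params)       # 3209600
```

Under these assumptions the encoder accounts for roughly 3.2 million of the 98 million listed parameters.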

Summary

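This model is the baseline model described in "Semi-supervised sequence tagging with bidirectional language models" (Peters et al., 2017). It uses a Gated Recurrent Unit (GRU) character encoder as well as a GRU phrase encoder, and its token embeddings are initialized with pretrained GloVe vectors. The hyperparameters listed for this checkpoint, however, specify a 2-layer bidirectional LSTM encoder and an architecture that includes ELMo and a CRF layer. The model was trained on the CoNLL-2003 NER dataset.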

Explore the live Named Entity Recognition demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("tagging-elmo-crf-tagger")

Getting predictions

sentence = "Jobs and Wozniak cofounded Apple in 1976."
preds = predictor.predict(sentence)
for word, tag in zip(preds["words"], preds["tags"]):
    print(word, tag)
# prints:
# Jobs U-PER
# and O
# Wozniak U-PER
# cofounded O
# Apple U-ORG
# in O
# 1976 O
# . O
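
The `U-` prefix in the output above suggests the tags use BIOUL encoding (Begin, Inside, Outside, Unit, Last), where `U-` marks a single-token entity. For downstream use it is often handier to have whole entities than per-token tags. The following is a minimal sketch of that grouping; `bioul_to_spans` is a hypothetical helper written for this example, not part of AllenNLP, and it simply discards malformed tag sequences.

```python
from typing import List, Optional, Tuple


def bioul_to_spans(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIOUL-tagged tokens into (entity_text, entity_type) pairs."""
    spans: List[Tuple[str, str]] = []
    current: List[str] = []          # tokens of the entity being built
    current_type: Optional[str] = None
    for word, tag in zip(words, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "U":
            # Single-token entity
            spans.append((word, label))
            current, current_type = [], None
        elif prefix == "B":
            # Start of a multi-token entity
            current, current_type = [word], label
        elif prefix in ("I", "L") and current_type == label:
            current.append(word)
            if prefix == "L":
                # Last token closes the entity
                spans.append((" ".join(current), label))
                current, current_type = [], None
        else:
            # "O" or a malformed sequence: drop any partial entity
            current, current_type = [], None
    return spans


words = ["Jobs", "and", "Wozniak", "cofounded", "Apple", "in", "1976", "."]
tags = ["U-PER", "O", "U-PER", "O", "U-ORG", "O", "O", "O"]
entities = bioul_to_spans(words, tags)
print(entities)
# prints:
# [('Jobs', 'PER'), ('Wozniak', 'PER'), ('Apple', 'ORG')]
```

With a real predictor, the same call could take `preds["words"]` and `preds["tags"]` directly.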

You can also get predictions using the allennlp command-line interface:

echo '{"sentence": "Jobs and Wozniak cofounded Apple in 1976."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz -
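
The command above pipes a single JSON object on standard input. For several sentences, one option is to write one JSON object per line to a file and pass that file instead of `-`. The sketch below prepares such a file; the file name `sentences.jsonl` and the second sentence are made up for illustration.

```python
import json
from pathlib import Path

sentences = [
    "Jobs and Wozniak cofounded Apple in 1976.",
    "Ada Lovelace was born in London.",  # illustrative sentence
]

input_path = Path("sentences.jsonl")  # hypothetical file name
with input_path.open("w", encoding="utf-8") as handle:
    for sentence in sentences:
        # One JSON object per line, using the same "sentence" key as above
        handle.write(json.dumps({"sentence": sentence}) + "\n")
```

The resulting file could then take the place of `-` in the `allennlp predict` command.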

How do I evaluate this model?

To evaluate the model on the CoNLL-2003 NER dataset, run:

allennlp evaluate https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz \
    path/to/dataset
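
NER on CoNLL-2003 is conventionally scored with span-level F1: a predicted entity counts as correct only if both its boundaries and its type match a gold entity. The sketch below shows that calculation on toy data. It illustrates the metric itself and is not AllenNLP's implementation; the `Span` type and `span_f1` function are hypothetical names.

```python
from typing import Set, Tuple

# (start_token, end_token_inclusive, entity_type)
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over exact-match entity spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: two of three predictions match the gold spans exactly
gold = {(0, 0, "PER"), (2, 2, "PER"), (4, 4, "ORG")}
predicted = {(0, 0, "PER"), (4, 4, "ORG"), (6, 6, "ORG")}
precision, recall, f1 = span_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```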

How do I train this model?

To train this model, you can use the allennlp CLI tool with the configuration file ner_elmo.jsonnet:

allennlp train ner_elmo.jsonnet -s output_dir
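
For a quick trial run it can help to shorten training without editing the configuration file. The `allennlp train` command accepts an `--overrides` argument containing JSON that is merged into the configuration. The sketch below builds such a command line; the key path `trainer.num_epochs` is an assumption about how ner_elmo.jsonnet is laid out and should be checked against the file.

```python
import json
import shlex

# Assumed config layout: a top-level "trainer" block with "num_epochs"
overrides = {"trainer": {"num_epochs": 5}}

command = [
    "allennlp", "train", "ner_elmo.jsonnet",
    "-s", "output_dir",
    "--overrides", json.dumps(overrides),
]

# Quote each part so the JSON survives shell parsing
command_line = " ".join(shlex.quote(part) for part in command)
print(command_line)
```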

See the AllenNLP Training and prediction guide for more details.

Citation

@inproceedings{Peters2017SemisupervisedST,
 author = {Matthew E. Peters and Waleed Ammar and Chandra Bhagavatula and R. Power},
 booktitle = {ACL},
 title = {Semi-supervised sequence tagging with bidirectional language models},
 year = {2017}
}

Results

Named Entity Recognition on CoNLL 2003 (English)

| Benchmark | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- |
| CoNLL 2003 (English) | ELMo-based Named Entity Recognition | F1 | 96 | #1 |