ELMo-based Named Entity Recognition

# Summary

This model is an ELMo-augmented version of the baseline tagger described in [Semi-supervised sequence tagging with bidirectional language models](https://api.semanticscholar.org/CorpusID:7197241). It combines pretrained GloVe token vectors with a character-level convolutional encoder and ELMo contextual embeddings, passes them through a two-layer bidirectional LSTM encoder, and predicts tags with a CRF layer. It was trained on the CoNLL-2003 NER dataset.

[Explore live Named Entity Recognition demo at AllenNLP](https://demo.allennlp.org/named-entity-recognition/named-entity-recognition).

## How do I load this model?

```python
from allennlp_models.pretrained import load_predictor
predictor = load_predictor("tagging-elmo-crf-tagger")
```

### Getting predictions

```python
sentence = "Jobs and Wozniak cofounded Apple in 1976."
preds = predictor.predict(sentence)
for word, tag in zip(preds["words"], preds["tags"]):
    print(word, tag)
# prints:
# Jobs U-PER
# and O
# Wozniak U-PER
# cofounded O
# Apple U-ORG
# in O
# 1976 O
# . O
```
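
The tags in the output above follow a BIOUL-style scheme (`U-` marks a single-token entity; multi-token entities would run `B-`, `I-`, `L-`). A minimal sketch of grouping such tags into entity spans is shown below. The helper name `bioul_to_spans` is hypothetical and not part of AllenNLP, and the word and tag lists are copied from the printed output rather than taken from a live predictor.

```python
from typing import List, Optional, Tuple


def bioul_to_spans(words: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIOUL-tagged tokens into (entity_text, label) pairs.

    Hypothetical helper for illustration; not an AllenNLP function.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    for word, tag in zip(words, tags):
        if tag == "O":
            # Outside any entity: discard any unfinished span.
            current_tokens, current_label = [], None
            continue
        prefix, label = tag.split("-", 1)
        if prefix == "U":
            # Single-token entity.
            spans.append((word, label))
            current_tokens, current_label = [], None
        elif prefix == "B":
            # Start of a multi-token entity.
            current_tokens, current_label = [word], label
        elif prefix in ("I", "L") and current_label == label:
            current_tokens.append(word)
            if prefix == "L":
                # Last token closes the entity.
                spans.append((" ".join(current_tokens), label))
                current_tokens, current_label = [], None
        else:
            # Ill-formed sequence (e.g. I- without a matching B-): drop it.
            current_tokens, current_label = [], None
    return spans


words = ["Jobs", "and", "Wozniak", "cofounded", "Apple", "in", "1976", "."]
tags = ["U-PER", "O", "U-PER", "O", "U-ORG", "O", "O", "O"]
print(bioul_to_spans(words, tags))
# [('Jobs', 'PER'), ('Wozniak', 'PER'), ('Apple', 'ORG')]
```

With a live predictor, the same helper could be called as `bioul_to_spans(preds["words"], preds["tags"])`.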

You can also get predictions using the `allennlp` command-line interface:

```shell
echo '{"sentence": "Jobs and Wozniak cofounded Apple in 1976."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz -
```
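
For more than one sentence, the same JSON objects can be written to a file, one per line, and the file path passed to `allennlp predict` in place of `-`. The sketch below only prepares such a file; the helper name `write_jsonl`, the file name `sentences.jsonl`, and the second sentence are illustrative assumptions.

```python
import json
from typing import Iterable


def write_jsonl(sentences: Iterable[str], path: str) -> int:
    """Write one {"sentence": ...} JSON object per line; return the line count."""
    count = 0
    with open(path, "w", encoding="utf-8") as f:
        for sentence in sentences:
            f.write(json.dumps({"sentence": sentence}) + "\n")
            count += 1
    return count


sentences = [
    "Jobs and Wozniak cofounded Apple in 1976.",
    "AllenNLP is developed at the Allen Institute for AI.",
]
n_written = write_jsonl(sentences, "sentences.jsonl")
print(n_written)
# 2
```

The resulting `sentences.jsonl` would then replace the trailing `-` in the command above.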

## How do I evaluate this model?

To evaluate the model on the CoNLL-2003 NER dataset, run:

```shell
allennlp evaluate https://storage.googleapis.com/allennlp-public-models/ner-elmo.2021-02-12.tar.gz \
    path/to/dataset
```
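
The `path/to/dataset` placeholder is expected to point at a CoNLL-2003 formatted file. As a rough sketch of that layout, assuming the usual four whitespace-separated columns (token, POS tag, chunk tag, NER tag), blank lines between sentences, and `-DOCSTART-` document markers, the file can be read as follows. The function name `read_conll2003` is hypothetical, the sample lines are hand-written in that layout, and AllenNLP uses its own dataset reader rather than this code.

```python
from typing import Iterable, List, Tuple


def read_conll2003(lines: Iterable[str]) -> List[List[Tuple[str, str]]]:
    """Parse CoNLL-2003 style lines into sentences of (token, ner_tag) pairs."""
    sentences: List[List[Tuple[str, str]]] = []
    tokens: List[Tuple[str, str]] = []
    for raw in lines:
        line = raw.strip()
        # Blank lines and document markers both end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if tokens:
                sentences.append(tokens)
                tokens = []
            continue
        fields = line.split()
        # First column is the token, last column is the NER tag.
        tokens.append((fields[0], fields[-1]))
    if tokens:
        sentences.append(tokens)
    return sentences


sample = """-DOCSTART- -X- -X- O

EU NNP I-NP I-ORG
rejects VBZ I-VP O
German JJ I-NP I-MISC
call NN I-NP O

Peter NNP I-NP I-PER
Blackburn NNP I-NP I-PER
""".splitlines()

parsed = read_conll2003(sample)
print(len(parsed))
# 2
print(parsed[1])
# [('Peter', 'I-PER'), ('Blackburn', 'I-PER')]
```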

## How do I train this model?

To train this model, you can use the `allennlp` CLI tool and the configuration file [ner_elmo.jsonnet](https://raw.githubusercontent.com/allenai/allennlp-models/v2.1.0/training_config/tagging/ner_elmo.jsonnet):

```shell
allennlp train ner_elmo.jsonnet -s output_dir
```

See the [AllenNLP Training and prediction](https://guide.allennlp.org/training-and-prediction#2) guide for more details.
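
The training command needs to know where the CoNLL-2003 files live. The sketch below assumes that `ner_elmo.jsonnet` reads the dataset paths from the environment variables `NER_TRAIN_DATA_PATH`, `NER_TEST_A_PATH`, and `NER_TEST_B_PATH`; these names are an assumption and should be checked against the configuration file before use. The helper `build_train_invocation` and the placeholder paths are hypothetical. The code only builds the command and environment; it does not launch training.

```python
import os
from typing import Dict, List, Tuple


def build_train_invocation(
    config_path: str,
    output_dir: str,
    train_path: str,
    dev_path: str,
    test_path: str,
) -> Tuple[List[str], Dict[str, str]]:
    """Return the `allennlp train` command and an environment for it."""
    env = dict(os.environ)
    # Assumed variable names; verify them against ner_elmo.jsonnet.
    env.update(
        {
            "NER_TRAIN_DATA_PATH": train_path,
            "NER_TEST_A_PATH": dev_path,
            "NER_TEST_B_PATH": test_path,
        }
    )
    cmd = ["allennlp", "train", config_path, "-s", output_dir]
    return cmd, env


cmd, env = build_train_invocation(
    "ner_elmo.jsonnet",
    "output_dir",
    "path/to/train",
    "path/to/dev",
    "path/to/test",
)
print(" ".join(cmd))
# allennlp train ner_elmo.jsonnet -s output_dir
```

With `allennlp` installed, the command could be started with `subprocess.run(cmd, env=env, check=True)`.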

## Citation

```bibtex
@inproceedings{Peters2017SemisupervisedST,
 author = {Matthew E. Peters and Waleed Ammar and Chandra Bhagavatula and R. Power},
 booktitle = {ACL},
 title = {Semi-supervised sequence tagging with bidirectional language models},
 year = {2017}
}
```

## Model details

| Setting | Value |
| --- | --- |
| Training Techniques | Adam |
| Architecture | CRF, Convolution, Dropout, ELMo, Highway Layer, LSTM, Linear Layer, ReLU |
| Learning Rate (LR) | 0.001 |
| Epochs | 75 |
| Dropout | 0.5 |
| Encoder Type | LSTM |
| Encoder Layers | 2 |
| Encoder Input Size | 1202 |
| Encoder Hidden Size | 200 |
| Encoder Bidirectional | True |
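
The encoder input size of 1202 listed above is consistent with concatenating three token representations. The individual widths below are assumptions that are not stated on this page (50-dimensional GloVe vectors, a 128-dimensional character CNN output, and 1024-dimensional ELMo vectors), so the sketch is only a plausibility check, not a statement of the exact configuration.

```python
# Assumed embedding widths (not stated in this model card).
GLOVE_DIM = 50       # pretrained GloVe token vectors
CHAR_CNN_DIM = 128   # character-level convolutional encoder output
ELMO_DIM = 1024      # ELMo contextual embeddings

# The three representations are concatenated per token.
encoder_input_size = GLOVE_DIM + CHAR_CNN_DIM + ELMO_DIM
print(encoder_input_size)
# 1202

# A bidirectional LSTM with hidden size 200 emits both directions per token.
ENCODER_HIDDEN_SIZE = 200
encoder_output_size = 2 * ENCODER_HIDDEN_SIZE
print(encoder_output_size)
# 400
```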