Constituency Parser with ELMo embeddings

Last updated on Mar 15, 2021

Parameters 98 Million
Encoder Layers 2
File Size 677.88 MB
Training Data Penn Treebank
Training Resources
Training Time

Training Techniques AdaDelta
Architecture Convolution, Dropout, ELMo, Feedforward Network, Highway Layer, LSTM, Linear Layer, ReLU
LR 1
Epochs 150
Encoder Type LSTM
Encoder Input Size 1074
Encoder Hidden Size 250
Encoder Bidirectional True

Summary

This is an implementation of a minimal neural model for constituency parsing based on independent scoring of labels and spans. The SpanConstituencyParser encodes a sequence of text with a stacked Seq2SeqEncoder, extracts span representations using a SpanExtractor, and then predicts a label for each span in the sequence. These labels are the non-terminal nodes of a constituency parse tree, which is then reconstructed greedily. The model uses ELMo embeddings, which are completely character-based, and these improve single-model performance from 92.6 F1 to 94.11 F1 on the Penn Treebank. This is a 20% relative error reduction: the error (100 minus F1) falls from 7.4 to 5.89 points.
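The greedy reconstruction step can be illustrated with a small, self-contained sketch. This is not the AllenNLP implementation: the function names (`crosses`, `greedy_select`, `build_tree`), the toy scores, and the label `X` are hypothetical, and the sketch assumes each candidate span already carries its best label and a score. It keeps the highest-scoring spans that do not cross one another and then nests them into a bracketed tree.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def crosses(a: Span, b: Span) -> bool:
    """True if the spans partially overlap (neither nested nor disjoint)."""
    return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]


def greedy_select(scored: Dict[Span, Tuple[str, float]]) -> Dict[Span, str]:
    """Visit spans from highest to lowest score; keep those that cross no kept span."""
    chosen: Dict[Span, str] = {}
    for span, (label, _score) in sorted(scored.items(), key=lambda kv: -kv[1][1]):
        if not any(crosses(span, other) for other in chosen):
            chosen[span] = label
    return chosen


def build_tree(tokens: List[str], chosen: Dict[Span, str]) -> str:
    """Render non-crossing labeled spans as a bracketed tree string."""

    def children(start: int, end: int) -> str:
        parts, i = [], start
        while i < end:
            # Longest kept span starting at i that fits inside (start, end),
            # excluding the parent span itself.
            ends = [j for (s, j) in chosen
                    if s == i and j <= end and (s, j) != (start, end)]
            if ends:
                j = max(ends)
                parts.append(f"({chosen[(i, j)]} {children(i, j)})")
                i = j
            else:
                parts.append(tokens[i])
                i += 1
        return " ".join(parts)

    n = len(tokens)
    body = children(0, n)
    return f"({chosen[(0, n)]} {body})" if (0, n) in chosen else body


# Toy input: (1, 3) crosses the higher-scoring (2, 4) and is dropped.
tokens = ["I", "shot", "an", "elephant"]
scored = {
    (0, 4): ("S", 0.9),
    (0, 1): ("NP", 0.8),
    (1, 4): ("VP", 0.7),
    (2, 4): ("NP", 0.6),
    (1, 3): ("X", 0.3),
}
print(build_tree(tokens, greedy_select(scored)))
# prints:
# (S (NP I) (VP shot (NP an elephant)))
```

Because every label is scored independently, nothing in the model prevents two crossing spans from both receiving labels; a selection step of this kind is what turns the per-span predictions into a well-formed tree.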

Explore the live Constituency Parsing demo at AllenNLP.

How do I load this model?

from allennlp_models.pretrained import load_predictor
predictor = load_predictor("structured-prediction-constituency-parser")

Getting predictions

sentence = "One morning I shot an elephant in my pajamas."
preds = predictor.predict(sentence)
print(preds["trees"])
# prints:
# (S (NP (CD One) (NN morning)) (NP (PRP I)) (VP (VBD shot) (NP (DT an) (NN elephant)) (PP (IN in) (NP (PRP$ my) (NNS pajamas)))) (. .))
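The value of preds["trees"] is a bracketed string, so it can be post-processed without extra libraries. The sketch below pulls out the words and their part-of-speech tags with a regular expression. The helper name `tagged_leaves` is hypothetical, and the pattern assumes every leaf has the form "(TAG word)" with no brackets inside the word, as in the output shown above.

```python
import re
from typing import List, Tuple

# Matches a leaf such as "(NN elephant)": a tag and a word, no nested brackets.
LEAF = re.compile(r"\(([^\s()]+) ([^\s()]+)\)")


def tagged_leaves(tree: str) -> List[Tuple[str, str]]:
    """Return (word, tag) pairs in sentence order."""
    return [(word, tag) for tag, word in LEAF.findall(tree)]


# The tree string printed in the example above.
tree = (
    "(S (NP (CD One) (NN morning)) (NP (PRP I)) "
    "(VP (VBD shot) (NP (DT an) (NN elephant)) "
    "(PP (IN in) (NP (PRP$ my) (NNS pajamas)))) (. .))"
)
pairs = tagged_leaves(tree)
print(pairs[:3])
# prints:
# [('One', 'CD'), ('morning', 'NN'), ('I', 'PRP')]
print(" ".join(word for word, _ in pairs))
# prints:
# One morning I shot an elephant in my pajamas .
```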

You can also get predictions using the allennlp command-line interface:

echo '{"sentence": "One morning I shot an elephant in my pajamas."}' | \
    allennlp predict https://storage.googleapis.com/allennlp-public-models/elmo-constituency-parser-2020.02.10.tar.gz -
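To parse several sentences in one run, the same JSON objects can be written one per line to a file, and that file's path passed to allennlp predict in place of the "-" (standard input). The sketch below builds such a file; the helper name `to_jsonl` and the filename sentences.jsonl are hypothetical, and the "sentence" key is taken from the command above.

```python
import json
from pathlib import Path
from typing import Iterable


def to_jsonl(sentences: Iterable[str]) -> str:
    """Serialize each sentence as one JSON object per line."""
    return "".join(json.dumps({"sentence": s}) + "\n" for s in sentences)


if __name__ == "__main__":
    batch = [
        "One morning I shot an elephant in my pajamas.",
        "How he got into my pajamas I will never know.",
    ]
    # Hypothetical output path; pass it to `allennlp predict` instead of "-".
    Path("sentences.jsonl").write_text(to_jsonl(batch), encoding="utf-8")
```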

How do I train this model?

To train this model, you can use the allennlp CLI tool and the configuration file constituency_parser_elmo.jsonnet:

allennlp train constituency_parser_elmo.jsonnet -s output_dir

See the AllenNLP Training and prediction guide for more details.

Citation

@inproceedings{Joshi2018ExtendingAP,
 author = {V. Joshi and Matthew E. Peters and Mark Hopkins},
 booktitle = {ACL},
 title = {Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples},
 year = {2018}
}

Results

Constituency Parsing on Penn Treebank

Constituency Parsing

Benchmark | Model | Metric name | Metric value | Global rank
Penn Treebank | Constituency Parser with ELMo embeddings | F1 score | 94.11 | #1