| Training Techniques | AdamW |
| --- | --- |
| Architecture | BERT, Dropout, Feedforward Network, Layer Normalization, Linear Layer, ReLU, Sigmoid, Tanh |
| LR | 0.0003 |
The model works by first computing an embedded representation of each span in the document. These span representations are scored and used to prune away spans that are unlikely to occur in a coreference cluster. For each remaining span, the model decides which antecedent span (if any) it is coreferent with. The resulting coreference links, after applying transitivity, imply a clustering of the spans in the document. The GloVe embeddings used in the original paper have been substituted with SpanBERT embeddings.
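The last step above — turning pairwise antecedent links into clusters via transitivity — can be sketched with a small union-find routine. This is an illustrative sketch, not AllenNLP's implementation; the spans and links below are hypothetical `(start, end)` token indices.

```python
def clusters_from_links(links):
    """Group spans into clusters given (mention, antecedent) link pairs.

    Transitivity is enforced by union-find: if A links to B and C links
    to A, then A, B, and C all end up in one cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for mention, antecedent in links:
        union(mention, antecedent)

    groups = {}
    for m in parent:
        groups.setdefault(find(m), set()).add(m)
    return [sorted(g) for g in groups.values()]

# Hypothetical links: span (10, 10) points back to (0, 1), and
# span (15, 16) points back to (10, 10).
links = [((10, 10), (0, 1)), ((15, 16), (10, 10))]
print(clusters_from_links(links))
# prints: [[(0, 1), (10, 10), (15, 16)]]
```

Because each mention links to at most one antecedent, the links form a forest, and union-find recovers the connected components (the clusters) in near-linear time.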
Explore the live Coreference Resolution demo at AllenNLP.
from allennlp_models.pretrained import load_predictor
predictor = load_predictor("coref-spanbert")
print(predictor.coref_resolved("The trophy doesn't fit in the brown suitcase because it is too big."))
# prints: The trophy doesn't fit in the brown suitcase because The trophy is too big.
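Under the hood, resolution amounts to replacing each non-representative mention with the text of its cluster's first mention. The following is a simplified, self-contained sketch of that substitution step (not AllenNLP's actual `coref_resolved`, which operates on spaCy tokens); the token indices and cluster below are hand-constructed for illustration.

```python
def resolve(tokens, clusters):
    """Rewrite tokens so that every later mention in a cluster is
    replaced by the text of the cluster's first (representative) mention."""
    replacements = {}  # mention start index -> (end index, replacement tokens)
    for cluster in clusters:
        first_start, first_end = cluster[0]
        representative = tokens[first_start:first_end + 1]
        for start, end in cluster[1:]:
            replacements[start] = (end, representative)

    out, i = [], 0
    while i < len(tokens):
        if i in replacements:
            end, representative = replacements[i]
            out.extend(representative)  # swap in the representative mention
            i = end + 1
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

tokens = "The trophy does n't fit in the brown suitcase because it is too big .".split()
# One cluster: "The trophy" (tokens 0-1) and "it" (token 10).
clusters = [[(0, 1), (10, 10)]]
print(resolve(tokens, clusters))
# prints: The trophy does n't fit in the brown suitcase because The trophy is too big .
```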
You can also get predictions using the allennlp command-line interface:
echo '{"sentence": "The trophy doesn'\''t fit in the brown suitcase because it is too big."}' | \
allennlp predict https://storage.googleapis.com/allennlp-public-models/coref-spanbert-large-2020.02.27.tar.gz -
To train this model, you can use the allennlp CLI tool and the configuration file coref_spanbert_large.jsonnet:
allennlp train coref_spanbert_large.jsonnet -s output_dir
See the AllenNLP Training and prediction guide for more details.
@inproceedings{Lee2018HigherorderCR,
  author = {Kenton Lee and Luheng He and Luke Zettlemoyer},
  booktitle = {NAACL-HLT},
  title = {Higher-order Coreference Resolution with Coarse-to-fine Inference},
  year = {2018}
}