Entity Linking and Discovery via Arborescence-based Supervised Clustering

2 Sep 2021 · Dhruv Agarwal, Rico Angell, Nicholas Monath, Andrew McCallum

Previous work has shown promising results in performing entity linking by measuring not only the affinities between mentions and entities but also those amongst mentions. In this paper, we present novel training and inference procedures that fully utilize mention-to-mention affinities by building minimum arborescences (i.e., directed spanning trees) over mentions and entities across documents in order to make linking decisions. We also show that this method gracefully extends to entity discovery, enabling the clustering of mentions that do not have an associated entity in the knowledge base. We evaluate our approach on the Zero-Shot Entity Linking dataset and MedMentions, the largest publicly available biomedical dataset, and show significant improvements in performance for both entity linking and discovery compared to identically parameterized models. We further show significant efficiency improvements with only a small loss in accuracy over previous work, which uses more computationally expensive models.
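To make the idea of linking through a minimum arborescence concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical virtual `ROOT` node with zero-cost edges into every entity, edge costs that act as dissimilarities (lower cost means higher affinity), and that every mention is reachable from `ROOT`. The function names `min_arborescence` and `link_mentions` and the toy costs are invented for illustration. The tree is computed with the standard Chu-Liu/Edmonds contraction procedure, and each mention is then linked to the entity at the top of its path.

```python
def find_cycle(best_in, root):
    """Return a list of nodes forming a cycle in the parent pointers, or None."""
    seen = {}  # node -> start node of the walk that first visited it
    for start in best_in:
        path, v = [], start
        while v != root and v not in seen and v in best_in:
            seen[v] = start
            path.append(v)
            v = best_in[v]
        if v in seen and seen[v] == start:  # walked back into our own path
            return path[path.index(v):]
    return None


def min_arborescence(edges, root):
    """Chu-Liu/Edmonds. edges: {(u, v): cost}. Returns {child: parent}."""
    # Step 1: cheapest incoming edge for every non-root node.
    best_in = {}
    for (u, v), w in edges.items():
        if v == root or u == v:
            continue
        if v not in best_in or w < edges[(best_in[v], v)]:
            best_in[v] = u
    cycle = find_cycle(best_in, root)
    if cycle is None:
        return best_in
    # Step 2: contract the cycle into one supernode and reweight entering edges.
    cyc = set(cycle)
    c = ("cycle", frozenset(cyc))
    new_edges, origin = {}, {}
    for (u, v), w in edges.items():
        if u in cyc and v in cyc:
            continue
        if v in cyc:
            key, w2 = (u, c), w - edges[(best_in[v], v)]
        elif u in cyc:
            key, w2 = (c, v), w
        else:
            key, w2 = (u, v), w
        if key not in new_edges or w2 < new_edges[key]:
            new_edges[key] = w2
            origin[key] = (u, v)
    sub = min_arborescence(new_edges, root)
    # Step 3: expand. Map contracted edges back to original ones; cycle nodes
    # that were not re-entered from outside keep their cycle edge.
    parent = {}
    for v, u in sub.items():
        ou, ov = origin[(u, v)]
        parent[ov] = ou
    for v in cyc:
        parent.setdefault(v, best_in[v])
    return parent


def link_mentions(parent, root, mentions):
    """Link each mention to the entity that is its ancestor just below root."""
    links = {}
    for m in mentions:
        v = m
        while parent[v] != root:
            v = parent[v]
        links[m] = v
    return links


# Toy graph (all costs are made up): two entities, four mentions.
ROOT = "ROOT"
edges = {
    (ROOT, "e1"): 0.0, (ROOT, "e2"): 0.0,
    ("e1", "m1"): 0.2, ("e2", "m1"): 0.9,
    ("m1", "m2"): 0.1, ("e1", "m2"): 0.8,
    ("e2", "m3"): 0.7, ("e1", "m3"): 0.9,
    ("e2", "m4"): 0.6,
    ("m3", "m4"): 0.1, ("m4", "m3"): 0.1,  # mutual nearest neighbours: a cycle
}
parent = min_arborescence(edges, ROOT)
links = link_mentions(parent, ROOT, ["m1", "m2", "m3", "m4"])
print(links)
```

In this toy graph, `m2` has only a weak direct edge from `e1` but reaches it cheaply through `m1`, and `m3` reaches `e2` through `m4`. These are the kind of linking decisions that mention-to-mention affinities make possible.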


Results from the Paper


Task            Dataset      Model   Metric Name            Metric Value  Global Rank
Entity Linking  MedMentions  ArboEL  Accuracy               72.3          # 1
Entity Linking  MedMentions  ArboEL  Recall@64              95.62         # 1
Entity Linking  ZESHEL       ArboEL  Unnormalized Accuracy  50.4          # 1
Entity Linking  ZESHEL       ArboEL  Recall@64              85.11         # 1
