Entity Resolution
49 papers with code • 10 benchmarks • 11 datasets
Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
Surveys on entity resolution:
- Christophides et al.: End-to-End Entity Resolution for Big Data: A Survey, 2020.
- Barlaug and Gulla: Neural Networks for Entity Matching: A Survey, 2021.
Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
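To make the task definition concrete, here is a minimal sketch of pairwise matching between two data sources. It is only an illustration under stated assumptions: the records, the helper names (`normalize`, `similarity`, `match_records`), and the 0.8 threshold are all hypothetical, and the character-level similarity from Python's standard `difflib` stands in for the learned matchers used in the papers listed on this page.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], taken from difflib's ratio()."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a, source_b, threshold=0.8):
    """Return (id_a, id_b, score) for each cross-source pair at or above the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    matches = []
    # Compare every record in source_a with every record in source_b.
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        score = similarity(name_a, name_b)
        if score >= threshold:
            matches.append((id_a, id_b, round(score, 3)))
    return matches


# Hypothetical toy records from two sources describing overlapping products.
source_a = {"a1": "Apple iPhone 12 64GB", "a2": "Samsung Galaxy S21"}
source_b = {"b1": "apple  iphone 12 (64 GB)", "b2": "Google Pixel 5"}

matches = match_records(source_a, source_b)
for id_a, id_b, score in matches:
    print(id_a, id_b, score)
```

This sketch compares all pairs, so its cost grows with the product of the two source sizes. Practical systems usually add a blocking step that restricts comparison to plausible candidate pairs.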
Most implemented papers
Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation
Named Entity Disambiguation algorithms typically learn a single model for all target entities.
Crowdsourcing and Aggregating Nested Markable Annotations
One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.
Optimal Transport-based Alignment of Learned Character Representations for String Similarity
We evaluate STANCE's ability to detect whether two strings can refer to the same entity, a task we term alias detection.
ZeroER: Entity Resolution using Zero Labeled Examples
We investigate an important problem that vexes practitioners: is it possible to design an effective algorithm for ER that requires Zero labeled examples, yet can achieve performance comparable to supervised approaches?
Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution
We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation.
AutoBlock: A Hands-off Blocking Framework for Entity Matching
Entity matching seeks to identify data records over one or multiple data sources that refer to the same real-world entity.
Crowdsourced Collective Entity Resolution with Relational Match Propagation
Knowledge bases (KBs) store rich yet heterogeneous entities and facts.
Deep Entity Matching with Pre-Trained Language Models
Our experiments show that a straightforward application of language models such as BERT, DistilBERT, or RoBERTa pre-trained on large text corpora already significantly improves the matching quality and outperforms previous state-of-the-art (SOTA), by up to 29% of F1 score on benchmark datasets.
Profiling Entity Matching Benchmark Tasks
In order to enable the exact reproducibility of evaluation results, matching tasks need to contain exactly defined sets of matching and non-matching record pairs, as well as a fixed development and test split.
Biomedical Named Entity Recognition at Scale
Named entity recognition (NER) is a widely applicable natural language processing task and building block of question answering, topic modeling, information retrieval, etc.