Entity Resolution
54 papers with code • 11 benchmarks • 11 datasets
Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
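To make the definition concrete, here is a minimal sketch of naive pairwise matching over a handful of records. It assumes records are plain strings and uses a character-level similarity from Python's standard library with an arbitrary threshold of 0.85. The function names, the threshold, and the sample records are illustrative assumptions, not taken from any of the papers listed here.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_matches(records, threshold=0.85):
    """Return index pairs whose similarity meets the (assumed) threshold."""
    matches = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            matches.append((i, j))
    return matches


# Hypothetical records: the first two describe the same person.
records = [
    "Jon Smith, 12 Main St",
    "John Smith, 12 Main St.",
    "Alice Jones, 4 Elm Rd",
]
print(find_matches(records))
```

Comparing every pair is quadratic in the number of records, which is why practical systems usually restrict the comparison to a smaller set of candidate pairs first.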
Surveys on entity resolution:
- Christophides et al.: End-to-End Entity Resolution for Big Data: A Survey, 2020.
- Barlaug and Gulla: Neural Networks for Entity Matching: A Survey, 2021.
Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
Most implemented papers
d-blink: Distributed End-to-End Bayesian Entity Resolution
Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.
Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org
This paper introduces a novel evaluation methodology for entity resolution algorithms.
Intermediate Training of BERT for Product Matching
Adding the masked language modeling objective in the intermediate training step in order to further adapt the language model to the application domain leads to an additional increase of up to 3% F1.
A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching
We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.
Can Foundation Models Wrangle Your Data?
Foundation Models (FMs) are models trained on large corpora of data that, at very large scale, can generalize to new tasks without any task-specific finetuning.
PIZZA: A new benchmark for complex end-to-end task-oriented parsing
Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate.
Towards Universal Dense Blocking for Entity Resolution
Blocking is a critical step in entity resolution, and the emergence of neural network-based representation models has led to the development of dense blocking as a promising approach for exploring deep semantics in blocking.
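As a rough illustration of what a blocking step does, the sketch below groups records by shared tokens and only proposes pairs that fall into a common block. This is classic token blocking, swapped in for simplicity; it is not the dense (embedding-based) blocking the paper studies. The function names and sample records are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def token_blocks(records):
    """Map each lowercase token to the set of record indices containing it."""
    blocks = defaultdict(set)
    for idx, text in enumerate(records):
        for token in set(text.lower().split()):
            blocks[token].add(idx)
    return blocks


def candidate_pairs(records):
    """Collect index pairs that share at least one block (token)."""
    pairs = set()
    for ids in token_blocks(records).values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical product records.
records = [
    "apple iphone 12",
    "iphone 12 apple",
    "samsung galaxy s21",
    "galaxy tab",
]
print(sorted(candidate_pairs(records)))
```

Only the pairs returned here would be passed to a more expensive matcher. A dense blocker would replace the shared-token test with a nearest-neighbor search over learned record embeddings.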
Match, Compare, or Select? An Investigation of Large Language Models for Entity Matching
Based on our findings, we further design a compound entity matching framework (ComEM) that leverages the composition of multiple strategies and LLMs.
A Practioner's Guide to Evaluating Entity Resolution Results
This paper provides practitioners the basic knowledge to begin evaluating their entity resolution results.
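One commonly used way to evaluate entity resolution output is pairwise precision, recall, and F1 over record pairs. The sketch below assumes each record carries one predicted cluster label and one ground-truth label; it is a generic illustration and not necessarily the set of metrics the paper covers. All names are hypothetical.

```python
from itertools import combinations


def matching_pairs(labels):
    """Index pairs of records that share the same cluster label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_scores(predicted, truth):
    """Pairwise precision, recall, and F1 of predicted vs. true clusters."""
    pred = matching_pairs(predicted)
    true = matching_pairs(truth)
    tp = len(pred & true)
    # Convention assumed here: an empty pair set scores 1.0.
    precision = tp / len(pred) if pred else 1.0
    recall = tp / len(true) if true else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels for four records.
truth = ["a", "a", "a", "b"]
predicted = [0, 0, 1, 1]
print(pairwise_scores(predicted, truth))
```

Because non-matching pairs vastly outnumber matching ones in most datasets, accuracy over all pairs is rarely informative, which is one reason pair-level precision and recall are typically reported instead.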
In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling
Entity resolution (ER) presents unique challenges for evaluation methodology.