Entity Resolution

49 papers with code • 10 benchmarks • 11 datasets

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Surveys on entity resolution:

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
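As a rough illustration of the entity resolution task defined above, the following Python sketch links records from two toy sources by string similarity and groups the matches into entities. It is a minimal sketch and not the method of any paper listed on this page: the records, the field names (`name`, `city`), the helper functions, and the 0.85 threshold are all hypothetical choices made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical toy records from two sources ("a*" and "b*") plus one singleton.
records = [
    {"id": "a1", "name": "Jon Smith", "city": "New York"},
    {"id": "b1", "name": "John Smith", "city": "New York"},
    {"id": "a2", "name": "Maria Garcia", "city": "Madrid"},
    {"id": "b2", "name": "Maria Garsia", "city": "Madrid"},
    {"id": "c1", "name": "Peter Jones", "city": "London"},
]


def similarity(rec_a, rec_b, fields=("name", "city")):
    """Average character-level similarity over the compared fields."""
    scores = [
        SequenceMatcher(None, rec_a[f].lower(), rec_b[f].lower()).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def resolve(records, threshold=0.85):
    """Group records whose pairwise similarity reaches the threshold.

    The threshold is an assumed value for this example, not a recommendation.
    """
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Follow parent pointers to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Compare every pair of records and merge the clusters of matching pairs.
    for rec_a, rec_b in combinations(records, 2):
        if similarity(rec_a, rec_b) >= threshold:
            parent[find(rec_a["id"])] = find(rec_b["id"])

    clusters = {}
    for r in records:
        clusters.setdefault(find(r["id"]), []).append(r["id"])
    return sorted(sorted(c) for c in clusters.values())


print(resolve(records))
```

Comparing every pair of records is quadratic in the number of records, so this sketch only suits small inputs; practical systems usually restrict comparisons to candidate pairs first.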


Most implemented papers

d-blink: Distributed End-to-End Bayesian Entity Resolution

ngmarchant/dblink 13 Sep 2019

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.

Intermediate Training of BERT for Product Matching

weyoun2211/productbert-intermediate • DI2KG: International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs @ VLDB 2020 • 2020

Adding the masked language modeling objective in the intermediate training step in order to further adapt the language model to the application domain leads to an additional increase of up to 3% F1.

A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching

Living-with-machines/DeezyMatch 17 Sep 2020

We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.

Can Foundation Models Wrangle Your Data?

hazyresearch/fm_data_tasks 20 May 2022

Foundation Models (FMs) are models trained on large corpora of data that, at very large scale, can generalize to new tasks without any task-specific finetuning.

Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org

patentsview/patentsview-evaluation 3 Oct 2022

This paper introduces a novel evaluation methodology for entity resolution algorithms.

PIZZA: A new benchmark for complex end-to-end task-oriented parsing

amazon-science/pizza-semantic-parsing-dataset 1 Dec 2022

Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate.

How to Evaluate Entity Resolution Systems: An Entity-Centric Framework with Application to Inventor Name Disambiguation

olivierbinette/er-evaluation 8 Apr 2024

These benchmark data sets can then be used for model training and a variety of evaluation tasks.

A Practitioner's Guide to Evaluating Entity Resolution Results

patentsview/patentsview-evaluation 14 Sep 2015

This paper provides practitioners the basic knowledge to begin evaluating their entity resolution results.
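For readers new to evaluation, one commonly used family of measures is pairwise precision, recall, and F1, computed over the record pairs that a clustering places in the same entity. The Python sketch below is an illustration under stated assumptions and is not taken from the paper above: the function names are hypothetical, and returning 1.0 for precision or recall when the corresponding pair set is empty is a convention chosen for this example.

```python
from itertools import combinations


def coreferent_pairs(clusters):
    """Return the set of unordered record pairs placed in the same cluster."""
    pairs = set()
    for cluster in clusters:
        # Sorting makes each pair appear in one canonical order.
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs


def pairwise_scores(predicted, truth):
    """Pairwise precision, recall, and F1 of a predicted clustering."""
    pred_pairs = coreferent_pairs(predicted)
    true_pairs = coreferent_pairs(truth)
    correct = len(pred_pairs & true_pairs)

    # Assumed convention: an empty pair set gives a score of 1.0.
    precision = correct / len(pred_pairs) if pred_pairs else 1.0
    recall = correct / len(true_pairs) if true_pairs else 1.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and prediction over five records.
truth = [["a", "b", "c"], ["d", "e"]]
predicted = [["a", "b"], ["c", "d", "e"]]
print(pairwise_scores(predicted, truth))
```

In this toy case the prediction recovers two of the four true pairs (a-b and d-e) and introduces two wrong ones (c-d and c-e), so precision, recall, and F1 are all 0.5.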

In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling

ngmarchant/oasis 2 Mar 2017

Entity resolution (ER) presents unique challenges for evaluation methodology.