Entity Resolution

39 papers with code • 10 benchmarks • 11 datasets

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
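The definition above can be illustrated with a minimal sketch of pairwise matching across two sources. This is not the method of any paper listed on this page. The record contents, the `match_records` helper, and the 0.85 threshold are assumptions chosen for illustration, and string similarity from Python's standard library stands in for a learned matcher.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a: str, b: str) -> float:
    """Character-level similarity ratio in [0, 1] from the standard library."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a, source_b, threshold=0.85):
    """Return (id_a, id_b) pairs whose names are similar enough to count as a match.

    The threshold is an illustrative assumption, not a recommended value.
    """
    matches = []
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        if similarity(name_a, name_b) >= threshold:
            matches.append((id_a, id_b))
    return matches


# Hypothetical toy records from two data sources.
source_a = {"a1": "Acme Corporation", "a2": "Globex Inc."}
source_b = {"b1": "ACME  corporation", "b2": "Initech LLC"}

print(match_records(source_a, source_b))  # [('a1', 'b1')]
```

Comparing every pair is quadratic in the number of records, so this sketch only suits small inputs. Practical systems usually restrict comparisons to candidate pairs first (often called blocking).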

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.


Most implemented papers

d-blink: Distributed End-to-End Bayesian Entity Resolution

ngmarchant/dblink 13 Sep 2019

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.

Intermediate Training of BERT for Product Matching

weyoun2211/productbert-intermediate DI2KG: International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs @ VLDB 2020

Adding the masked language modeling objective in the intermediate training step in order to further adapt the language model to the application domain leads to an additional increase of up to 3% F1.

A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching

Living-with-machines/DeezyMatch 17 Sep 2020

We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.

Can Foundation Models Wrangle Your Data?

hazyresearch/fm_data_tasks 20 May 2022

Foundation Models (FMs) are models trained on large corpora of data that, at very large scale, can generalize to new tasks without any task-specific finetuning.

Estimating the Performance of Entity Resolution Algorithms: Lessons Learned Through PatentsView.org

patentsview/patentsview-evaluation 3 Oct 2022

This paper introduces a novel evaluation methodology for entity resolution algorithms.

PIZZA: A new benchmark for complex end-to-end task-oriented parsing

amazon-science/pizza-semantic-parsing-dataset 1 Dec 2022

Much recent work in task-oriented parsing has focused on finding a middle ground between flat slots and intents, which are inexpressive but easy to annotate, and powerful representations such as the lambda calculus, which are expressive but costly to annotate.

A Practitioner's Guide to Evaluating Entity Resolution Results

patentsview/patentsview-evaluation 14 Sep 2015

This paper provides practitioners the basic knowledge to begin evaluating their entity resolution results.

In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling

ngmarchant/oasis 2 Mar 2017

Entity resolution (ER) presents unique challenges for evaluation methodology.

Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation

anderbarrena/500kNED CoNLL 2018

Named Entity Disambiguation algorithms typically learn a single model for all target entities.