Entity Resolution

26 papers with code • 7 benchmarks • 7 datasets

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
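
As a concrete illustration of the entity resolution task defined above, the following is a minimal sketch of pairwise matching across two sources. It is not taken from any of the papers listed on this page. The records, the `normalize`, `similarity`, and `match_records` helpers, and the 0.8 threshold are all hypothetical, and the similarity measure is simply the ratio from Python's standard-library `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences vanish."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """String similarity in [0, 1], using difflib's SequenceMatcher ratio."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(source_a, source_b, threshold=0.8):
    """Return (id_a, id_b, score) for each cross-source pair scoring >= threshold."""
    matches = []
    # Compare every record in source_a with every record in source_b.
    for (id_a, name_a), (id_b, name_b) in product(source_a.items(), source_b.items()):
        score = similarity(name_a, name_b)
        if score >= threshold:
            matches.append((id_a, id_b, round(score, 2)))
    return matches


# Hypothetical toy data: two sources describing overlapping products.
source_a = {"a1": "Apple iPhone 12 64GB", "a2": "Samsung Galaxy S21"}
source_b = {"b1": "apple  iphone 12 64 gb", "b2": "Google Pixel 5"}

matches = match_records(source_a, source_b)
print(matches)
```

Comparing all pairs is quadratic in the number of records, so a sketch like this only suits small inputs; practical systems usually restrict comparisons to a reduced set of candidate pairs first.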

Most implemented papers

d-blink: Distributed End-to-End Bayesian Entity Resolution

ngmarchant/dblink 13 Sep 2019

Entity resolution (ER; also known as record linkage or de-duplication) is the process of merging noisy databases, often in the absence of unique identifiers.

Intermediate Training of BERT for Product Matching

weyoun2211/productbert-intermediate DI2KG: International Workshop on Challenges and Experiences from Data Integration to Knowledge Graphs @ VLDB 2020

Adding the masked language modeling objective in the intermediate training step in order to further adapt the language model to the application domain leads to an additional increase of up to 3% F1.

A Deep Learning Approach to Geographical Candidate Selection through Toponym Matching

Living-with-machines/DeezyMatch 17 Sep 2020

We report its performance on candidate selection in the context of the downstream task of toponym resolution, both on existing datasets and on a new manually-annotated resource of nineteenth-century English OCR'd text.

In Search of an Entity Resolution OASIS: Optimal Asymptotic Sequential Importance Sampling

ngmarchant/oasis 2 Mar 2017

Entity resolution (ER) presents unique challenges for evaluation methodology.

Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation

anderbarrena/500kNED CoNLL 2018

Named Entity Disambiguation algorithms typically learn a single model for all target entities.

Crowdsourcing and Aggregating Nested Markable Annotations

juntaoy/dali-preprocessing-pipeline ACL 2019

One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.

Optimal Transport-based Alignment of Learned Character Representations for String Similarity

iesl/stance ACL 2019

We evaluate STANCE's ability to detect whether two strings can refer to the same entity--a task we term alias detection.

ZeroER: Entity Resolution using Zero Labeled Examples

chu-data-lab/zeroer 16 Aug 2019

We investigate an important problem that vexes practitioners: is it possible to design an effective algorithm for ER that requires Zero labeled examples, yet can achieve performance comparable to supervised approaches?

Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution

lokhande-vishnu/EntityResolution 12 Sep 2019

We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation.
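
For orientation, the textbook form of weighted set packing and its linear-programming relaxation can be written as below. This is a generic sketch rather than the paper's formulation: the symbols $\mathcal{G}$ (candidate sets), $\mathcal{D}$ (items), $w_g$ (set weights), and $x_g$ (selection variables) are assumptions introduced here, and the paper's objective and constraints may differ.

```latex
\begin{align*}
\text{ILP:}\quad
  & \max_{x} \sum_{g \in \mathcal{G}} w_g x_g
    \quad \text{s.t.} \quad
    \sum_{g \in \mathcal{G} \,:\, d \in g} x_g \le 1 \;\; \forall d \in \mathcal{D},
    \quad x_g \in \{0, 1\} \;\; \forall g \in \mathcal{G} \\
\text{LP relaxation:}\quad
  & \text{same objective and packing constraints, with }
    0 \le x_g \le 1 \;\; \forall g \in \mathcal{G}
\end{align*}
```

Relaxing integrality replaces the binary requirement on each $x_g$ with an interval, so the relaxed optimum is an upper bound on the integer optimum of this maximization.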