Entity Resolution

49 papers with code • 10 benchmarks • 11 datasets

Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)

Surveys on entity resolution:

Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
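As a rough illustration of the task defined above, the following minimal sketch compares every record in one source against every record in another and keeps the pairs whose string similarity clears a threshold. The record data, the function names (`similarity`, `match_records`), and the 0.8 threshold are all hypothetical choices made for this example. The leaderboard entries below rely on learned matchers, not on a plain string-similarity baseline like this one.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], computed after lowercasing."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_records(source_a, source_b, threshold=0.8):
    """Return (id_a, id_b, score) for cross-source pairs scoring >= threshold.

    The threshold value is an assumption for this sketch, not a recommended setting.
    """
    matches = []
    for (id_a, text_a), (id_b, text_b) in product(source_a.items(), source_b.items()):
        score = similarity(text_a, text_b)
        if score >= threshold:
            matches.append((id_a, id_b, score))
    return matches


# Hypothetical toy records from two product catalogues
shop_a = {"a1": "Apple iPhone 12 64GB Black", "a2": "Sony WH-1000XM4 Headphones"}
shop_b = {"b1": "apple iphone 12 64gb black", "b2": "Canon EOS 90D Camera Body"}

matches = match_records(shop_a, shop_b)
for id_a, id_b, score in matches:
    print(id_a, id_b, round(score, 2))
```

Comparing all pairs scales quadratically with the number of records, which is why the Blocking subtask listed further down is usually applied first to narrow the candidate pairs.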

Benchmarks

These leaderboards track progress in entity resolution.

Dataset	Best Model
Amazon-Google	gpt4-0613_fewshot-10
Abt-Buy	gpt4-0613_zeroshot
WDC Computers-small	BERT
WDC Computers-xlarge	RoBERTa-SupCon
WDC Products-80%cc-seen-medium	gpt4-0613_zeroshot
WDC Watches-small	HG
WDC Products-50%cc-unseen-medium	RoBERTa-base
WDC Watches-xlarge	JointBERT
MusicBrainz20K	ALMSER-GB
WDC Products-80%cc-seen-medium-multi	RoBERTa-SupCon

Libraries

Use these libraries to find Entity Resolution models and implementations

megagonlabs/rotom

2 papers

Datasets

Subtasks

Blocking

Most implemented papers

Learning Text Representations for 500K Classification Tasks on Named Entity Disambiguation

anderbarrena/500kNED • CoNLL 2018

Named Entity Disambiguation algorithms typically learn a single model for all target entities.

Crowdsourcing and Aggregating Nested Markable Annotations

juntaoy/dali-preprocessing-pipeline • ACL 2019

One of the key steps in language resource creation is the identification of the text segments to be annotated, or markables, which depending on the task may vary from nominal chunks for named entity resolution to (potentially nested) noun phrases in coreference resolution (or mentions) to larger text segments in text segmentation.

Optimal Transport-based Alignment of Learned Character Representations for String Similarity

iesl/stance • ACL 2019

We evaluate STANCE's ability to detect whether two strings can refer to the same entity--a task we term alias detection.

ZeroER: Entity Resolution using Zero Labeled Examples

chu-data-lab/zeroer • 16 Aug 2019

We investigate an important problem that vexes practitioners: is it possible to design an effective algorithm for ER that requires Zero labeled examples, yet can achieve performance comparable to supervised approaches?

Accelerating Column Generation via Flexible Dual Optimal Inequalities with Application to Entity Resolution

lokhande-vishnu/EntityResolution • 12 Sep 2019

We tackle optimization of weighted set packing by relaxing integrality in our ILP formulation.

AutoBlock: A Hands-off Blocking Framework for Entity Matching

vintasoftware/entity-embed • 7 Dec 2019

Entity matching seeks to identify data records over one or multiple data sources that refer to the same real-world entity.

Crowdsourced Collective Entity Resolution with Relational Match Propagation

nju-websoft/Remp • 21 Feb 2020

Knowledge bases (KBs) store rich yet heterogeneous entities and facts.

Deep Entity Matching with Pre-Trained Language Models

megagonlabs/ditto • 1 Apr 2020

Our experiments show that a straightforward application of language models such as BERT, DistilBERT, or RoBERTa pre-trained on large text corpora already significantly improves the matching quality and outperforms previous state-of-the-art (SOTA), by up to 29% of F1 score on benchmark datasets.

Profiling Entity Matching Benchmark Tasks

wbsg-uni-mannheim/EntityMatchingTaskProfiler • International Conference on Information & Knowledge Management 2020

In order to enable the exact reproducibility of evaluation results, matching tasks need to contain exactly defined sets of matching and non-matching record pairs, as well as a fixed development and test split.

Biomedical Named Entity Recognition at Scale

JohnSnowLabs/spark-nlp-workshop • 12 Nov 2020

Named entity recognition (NER) is a widely applicable natural language processing task and building block of question answering, topic modeling, information retrieval, etc.
