Entity Resolution
50 papers with code • 10 benchmarks • 11 datasets
Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia)
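To make the definition concrete, here is a minimal sketch of pairwise matching between two small sources. It is an illustration only, not a method taken from any of the papers listed on this page: the record fields (`id`, `name`, `city`), the helper names, and the 0.85 threshold are all assumptions chosen for the example, and string similarity comes from Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def similarity(a, b, fields):
    """Mean character-level similarity over the given fields (hypothetical scoring)."""
    scores = [
        SequenceMatcher(None, normalize(a[f]), normalize(b[f])).ratio()
        for f in fields
    ]
    return sum(scores) / len(scores)


def match_records(source_a, source_b, fields=("name", "city"), threshold=0.85):
    """Compare every cross-source pair and keep those at or above the threshold."""
    return [
        (a["id"], b["id"])
        for a, b in product(source_a, source_b)
        if similarity(a, b, fields) >= threshold
    ]


# Toy records invented for this example.
source_a = [
    {"id": "a1", "name": "Jon Smith", "city": "New York"},
    {"id": "a2", "name": "Maria Garcia", "city": "Madrid"},
]
source_b = [
    {"id": "b1", "name": "John Smith", "city": "new york"},
    {"id": "b2", "name": "Wei Chen", "city": "Beijing"},
]

matches = match_records(source_a, source_b)
print(matches)
```

Comparing every pair is quadratic in the number of records, so this sketch only suits small inputs; practical systems usually restrict the candidate pairs first (blocking) before scoring them.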
Surveys on entity resolution:
- Christophides et al.: End-to-End Entity Resolution for Big Data: A Survey, 2020.
- Barlaug and Gulla: Neural Networks for Entity Matching: A Survey, 2021.
Entity resolution is closely related to entity alignment, which focuses on matching entities between knowledge bases. Entity linking differs from entity resolution in that it focuses on identifying entity mentions in free text.
Latest papers with no code
Combining Global and Local Merges in Logic-based Entity Resolution
In the recently proposed Lace framework for collective entity resolution, logical rules and constraints are used to identify pairs of entity references (e.g., author or paper IDs) that denote the same entity.
Beyond Rule-based Named Entity Recognition and Relation Extraction for Process Model Generation from Natural Language Text
We propose an extension to the PET dataset that incorporates information about linguistic references and a corresponding method for resolving them.
A Framework for Combining Entity Resolution and Query Answering in Knowledge Bases
We propose a new framework for combining entity resolution and query answering in knowledge bases (KBs) with tuple-generating dependencies (tgds) and equality-generating dependencies (egds) as rules.
Another Generic Setting for Entity Resolution: Basic Theory
They treated the functions for matching and merging entity records as black-boxes and introduced four important properties that enable efficient generic ER algorithms.
KAER: A Knowledge Augmented Pre-Trained Language Model for Entity Resolution
Entity resolution has been an essential and well-studied task in data cleaning research for decades.
Introducing Semantics into Speech Encoders
Recent studies find existing self-supervised speech encoders contain primarily acoustic rather than semantic information.
Low-cost Relevance Generation and Evaluation Metrics for Entity Resolution in AI
Relevance generation and 2.
Bridging the Gap between Reality and Ideality of Entity Matching: A Revisiting and Benchmark Re-Construction
Experimental results demonstrate that the assumptions made in the previous benchmark construction process do not hold in the open environment; they conceal the main challenges of the task and therefore significantly overestimate the current progress of entity matching.
A Survey on Efficient Processing of Similarity Queries over Neural Embeddings
Embedding techniques represent raw data objects as vectors (so-called "embeddings" or "neural embeddings", since they are mostly generated by neural network models) that expose the hidden semantics of the raw data. Embeddings are highly effective at capturing data similarities, making them one of the most widely used and studied techniques in state-of-the-art similarity query processing research.
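As a rough illustration of a similarity query over embeddings, the sketch below ranks stored vectors by cosine similarity to a query vector using a brute-force scan. The three-dimensional vectors are hand-made placeholders rather than output from any neural model, the function names are hypothetical, and the linear scan is the naive baseline rather than any efficient technique from the survey.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def top_k(query, embeddings, k=2):
    """Score every stored vector against the query and return the k best."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in embeddings.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Placeholder vectors invented for this example (not real model output).
embeddings = {
    "laptop": [0.9, 0.1, 0.0],
    "notebook computer": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]
result = top_k(query, embeddings, k=2)
print([key for key, _ in result])
```

The scan costs one similarity computation per stored vector for every query, which is why large collections typically rely on approximate nearest-neighbor indexes instead.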
Why the Rich Get Richer? On the Balancedness of Random Partition Models
Random partition models are widely used in Bayesian methods for various clustering tasks, such as mixture models, topic models, and community detection problems.