Profiling Entity Matching Benchmark Tasks

Entity matching is a central task in data integration which has been researched for decades. Over this time, a wide range of benchmark tasks for evaluating entity matching methods has been developed. This resource paper systematically profiles, compares, and complements 21 entity matching benchmark tasks. In order to better understand the specific challenges associated with different tasks, we define a set of profiling dimensions which capture central aspects of the matching tasks. Using these dimensions, we create groups of benchmark tasks having similar characteristics. Afterwards, we assess the difficulty of the tasks in each group by computing baseline evaluation results using standard feature engineering together with two common classification methods. In order to enable the exact reproducibility of evaluation results, matching tasks need to contain exactly defined sets of matching and non-matching record pairs, as well as a fixed development and test split. As this is not the case for some widely-used benchmark tasks, we complement these tasks with fixed sets of non-matching pairs, as well as fixed splits, and provide the resulting development and test sets for public download. By profiling and complementing the benchmark tasks, we support researchers in selecting challenging as well as diverse tasks and in comparing matching systems on clearly defined grounds.
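The baselines combine standard feature engineering with common classifiers, and the results table below reports Random Forest F1 scores. The following is a minimal sketch of such a baseline pipeline, not the paper's exact setup: the file names, the column schema (`title_left`, `title_right`, `label`), and the choice of `rapidfuzz` string similarities are illustrative assumptions.

```python
# Minimal baseline sketch: string-similarity features + Random Forest.
# File paths and column names are hypothetical placeholders, and the
# feature set is illustrative, not the paper's exact configuration.
import pandas as pd
from rapidfuzz import fuzz
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

def make_features(pairs: pd.DataFrame) -> pd.DataFrame:
    """Compute simple string-similarity features for each record pair.

    Assumes clean (non-null) string columns title_left / title_right.
    """
    return pd.DataFrame({
        "title_token_sort": [
            fuzz.token_sort_ratio(a, b)
            for a, b in zip(pairs["title_left"], pairs["title_right"])
        ],
        "title_partial": [
            fuzz.partial_ratio(a, b)
            for a, b in zip(pairs["title_left"], pairs["title_right"])
        ],
    })

# Fixed development and test splits, as the paper requires for
# reproducible evaluation (file names are assumptions).
dev = pd.read_csv("task_dev.csv")    # columns: title_left, title_right, label
test = pd.read_csv("task_test.csv")

clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(make_features(dev), dev["label"])

pred = clf.predict(make_features(test))
print(f"F1: {f1_score(test['label'], pred):.3f}")
```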

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Entity Resolution | Abt-Buy | Random Forest | F1 (%) | 85.0 | # 6 |
| Entity Resolution | Amazon-Google | Random Forest | F1 (%) | 79.0 | # 4 |
| Entity Resolution | WDC Computers-xlarge | Random Forest | F1 (%) | 78.0 | # 6 |

Methods


No methods listed for this paper.