Entity Resolution with Hierarchical Graph Attention Networks

Entity Resolution (ER) links entities from different sources that refer to the same real-world entity. Existing work usually takes pairs of entities as input and judges each pair independently. However, ER decisions for different pairs are often interdependent, e.g., the entities from the same data source are usually semantically related to each other. Furthermore, current ER approaches are mainly based on attribute similarity comparison and ignore the interdependence between attributes. To address the limitations of existing methods, we propose HierGAT, a new method for ER based on a Hierarchical Graph Attention Transformer Network, which can model and exploit the interdependence between different ER decisions. The benefits of our method come from 1) the graph attention network model, which makes ER decisions jointly, and 2) the graph attention mechanism's ability to identify the most discriminative words within attributes and the most discriminative attributes. Furthermore, we propose to learn contextual embeddings to enrich word embeddings for better performance. Experimental results on publicly available benchmark datasets show that HierGAT outperforms DeepMatcher by up to 32.5% in F1 score and Ditto by up to 8.7% in F1 score.
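
The two attention levels named in the abstract (words within an attribute, then attributes within an entity) can be illustrated with a minimal sketch. This is not the HierGAT model: it leaves out the graph structure, the Transformer embeddings, and all training, and the names `attention_pool`, `embed_entity`, `token_query`, and `attribute_query` are hypothetical. It only assumes that each level scores its inputs against a query vector, normalizes the scores with a softmax, and takes the weighted sum.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_pool(vectors, query):
    """Weighted sum of `vectors`, weighted by softmax(dot(vector, query))."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def embed_entity(attributes, token_query, attribute_query):
    """Two-level pooling (illustrative only, not the paper's architecture).

    attributes: dict mapping attribute name -> list of token vectors.
    Level 1 pools the tokens of each attribute into one attribute vector.
    Level 2 pools the attribute vectors into one entity vector.
    """
    names = list(attributes)
    attr_vectors = [attention_pool(attributes[n], token_query)[0] for n in names]
    entity_vector, attr_weights = attention_pool(attr_vectors, attribute_query)
    return entity_vector, dict(zip(names, attr_weights))


# Toy 2-dimensional token vectors, made up for illustration.
entity = {
    "title": [[1.0, 0.0], [0.0, 1.0]],
    "price": [[0.5, 0.5]],
}
vector, weights = embed_entity(
    entity, token_query=[1.0, 0.0], attribute_query=[0.0, 1.0]
)
print(vector, weights)
```

The returned `weights` dictionary shows how much each attribute contributes to the entity vector, which is the sense in which an attention layer can surface the most discriminative attributes. In a trained model the query vectors would be learned rather than fixed by hand.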

Results from the Paper

Task               Dataset               Model  Metric  Value  Global Rank
Entity Resolution  Abt-Buy               HG     F1 (%)  89.8   #4
Entity Resolution  Amazon-Google         HG     F1 (%)  76.4   #5
Entity Resolution  WDC Computers-small   HG     F1 (%)  88.50  #3
Entity Resolution  WDC Computers-xlarge  HG     F1 (%)  96.50  #4
Entity Resolution  WDC Watches-small     HG     F1 (%)  94     #1
Entity Resolution  WDC Watches-xlarge    HG     F1 (%)  96.50  #3
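
The metric in the results above is F1, reported as a percentage. As a reminder of what that number measures, here is a minimal sketch assuming the standard definition for pairwise matching: the harmonic mean of precision and recall over predicted matching pairs. The function name `f1_score` and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """F1 for a binary match / non-match decision, as a fraction in [0, 1]."""
    predicted = true_positives + false_positives  # pairs predicted as matches
    actual = true_positives + false_negatives     # pairs that truly match
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 90 correct matches, 10 spurious matches, 20 missed matches.
score = f1_score(90, 10, 20)
print(f"F1 = {100 * score:.1f}%")  # prints "F1 = 85.7%"
```

Because F1 ignores true negatives, it suits ER benchmarks where non-matching pairs greatly outnumber matching ones.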