Entity Resolution with Hierarchical Graph Attention Networks
Entity Resolution (ER) links entities from different sources that refer to the same real-world entity. Existing work usually takes pairs of entities as input and judges each pair independently. However, ER decisions on different pairs are often interdependent; for example, entities from the same data source are usually semantically related to each other. Furthermore, current ER approaches rely mainly on attribute similarity comparison and ignore the interdependence between attributes. To address these limitations, we propose HierGAT, a new ER method based on a Hierarchical Graph Attention Transformer Network that models and exploits the interdependence between different ER decisions. The benefit of our method comes from: 1) a graph attention network model for joint ER decisions; 2) the graph-attention capability to identify the discriminative words within attributes and to find the most discriminative attributes. Furthermore, we propose learning contextual embeddings to enrich word embeddings for better performance. Experimental results on publicly available benchmark datasets show that HierGAT outperforms DeepMatcher by up to 32.5% in F1 score and Ditto by up to 8.7% in F1 score.
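The idea of attending first over words within an attribute and then over attributes can be illustrated with a minimal sketch. This is not the authors' implementation: the dot-product scoring, the query vectors, and the names `attention_pool` and `entity_embedding` are assumptions made for illustration, and a real graph attention layer would use learned projections and a graph structure instead.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attention_pool(vectors: List[Vector], query: Vector) -> Tuple[Vector, List[float]]:
    """Weight each vector by softmax(dot(vector, query)) and sum them.

    Assumption: plain dot-product scoring stands in for a learned
    attention function.
    """
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def entity_embedding(
    attributes: List[List[Vector]], word_query: Vector, attr_query: Vector
) -> Tuple[Vector, List[float]]:
    """Two-level pooling: words -> attribute vectors -> entity vector."""
    attr_vectors = [attention_pool(words, word_query)[0] for words in attributes]
    return attention_pool(attr_vectors, attr_query)


# Hypothetical toy data: one entity with two attributes, 2-d word embeddings.
title_words = [[1.0, 0.0], [0.0, 1.0]]
price_words = [[0.5, 0.5]]
word_query = [2.0, 0.0]   # hypothetical query favouring the first dimension
attr_query = [1.0, 1.0]   # hypothetical attribute-level query

_, word_weights = attention_pool(title_words, word_query)
entity_vec, attr_weights = entity_embedding(
    [title_words, price_words], word_query, attr_query
)
```

In this toy setup the first title word receives the larger attention weight, which is the sense in which attention can single out discriminative words; the attribute-level weights play the same role one level up.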
Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Entity Resolution | Abt-Buy | HG | F1 (%) | 89.8 | #6 |
Entity Resolution | Amazon-Google | HG | F1 (%) | 76.4 | #6 |
Entity Resolution | WDC Computers-small | HG | F1 (%) | 88.50 | #3 |
Entity Resolution | WDC Computers-xlarge | HG | F1 (%) | 96.50 | #4 |
Entity Resolution | WDC Watches-small | HG | F1 (%) | 94 | #1 |
Entity Resolution | WDC Watches-xlarge | HG | F1 (%) | 96.50 | #3 |
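
For reference, the F1 (%) metric in the table is the harmonic mean of precision and recall over predicted match pairs, expressed as a percentage. A minimal sketch follows; the function name `f1_percent` and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def f1_percent(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 score as a percentage, from match-pair confusion counts."""
    if true_pos == 0:
        # No correct matches: precision and recall are both zero (or undefined).
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 100.0 * 2.0 * precision * recall / (precision + recall)


# Hypothetical counts: 90 correct matches, 10 spurious, 10 missed.
example_f1 = f1_percent(true_pos=90, false_pos=10, false_neg=10)
```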