Matching Web Tables to DBpedia - A Feature Utility Study

EDBT 2017 · Dominique Ritze, Christian Bizer

Relational HTML tables on the Web contain data describing a multitude of entities and covering a wide range of topics. Thus, web tables are very useful for filling missing values in cross-domain knowledge bases such as DBpedia, YAGO, or the Google Knowledge Graph. Before web table data can be used to fill missing values, the tables need to be matched to the knowledge base in question. This involves three matching tasks: table-to-class matching, row-to-instance matching, and attribute-to-property matching. Various matching approaches have been proposed for each of these tasks. Unfortunately, the existing approaches are evaluated using different web table corpora. Each individual approach also only exploits a subset of the web table and knowledge base features that are potentially helpful for the matching tasks. These two shortcomings make it difficult to compare the different matching approaches and to judge the impact of each feature on the overall matching results. This paper contributes to improving the understanding of the utility of different features for web table to knowledge base matching by reimplementing different matching techniques as well as similarity score aggregation methods from the literature within a single matching framework and evaluating different combinations of these techniques against a single gold standard. The gold standard consists of class, instance, and property correspondences between the DBpedia knowledge base and web tables from the Web Data Commons web table corpus.
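To make the idea of similarity score aggregation concrete, the following is a minimal sketch of row-to-instance matching in which two per-feature similarity scores are combined by a weighted average. It is only an illustration: the two features (label similarity and a precomputed popularity score), the weights, the candidate dictionary layout, and all function names are assumptions made for this example, and a weighted average is just one simple aggregation option, not necessarily the one used in the paper.

```python
from difflib import SequenceMatcher


def label_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] between two labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def aggregate(scores: dict, weights: dict) -> float:
    """Weighted average of per-feature similarity scores (hypothetical scheme)."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def match_row(row_label: str, candidates: list, weights: dict):
    """Return (uri, score) of the best-scoring candidate instance, or None."""
    best = None
    for cand in candidates:
        # Each feature yields its own similarity score for this candidate.
        scores = {
            "label": label_similarity(row_label, cand["label"]),
            "popularity": cand["popularity"],  # assumed to be in [0, 1]
        }
        combined = aggregate(scores, weights)
        if best is None or combined > best[1]:
            best = (cand["uri"], combined)
    return best


# Illustrative candidates; URIs and popularity values are made up.
candidates = [
    {"uri": "dbr:Berlin", "label": "Berlin", "popularity": 0.9},
    {"uri": "dbr:Bern", "label": "Bern", "popularity": 0.5},
]
weights = {"label": 0.8, "popularity": 0.2}
best_match = match_row("berlin", candidates, weights)
```

Changing the weights, or dropping a feature from the `scores` dictionary, shows how much each feature contributes to the final decision, which is the kind of comparison a feature utility study relies on.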


Datasets


Introduced in the Paper:

T2Dv2

Results from the Paper


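| Task                        | Dataset | Model | Metric Name | Metric Value | Global Rank |
|-----------------------------|---------|-------|-------------|--------------|-------------|
| Row Annotation              | T2Dv2   | T2K   | F1 (%)      | 80           | #1          |
| Table Type Detection        | T2Dv2   | T2K   | F1 (%)      | 92           | #1          |
| Columns Property Annotation | T2Dv2   | T2K   | F1 (%)      | 81           | #1          |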
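For readers unfamiliar with the metric, the sketch below shows how an F1 score can be computed by comparing predicted correspondences with a gold standard. It assumes the standard definition of F1 as the harmonic mean of precision and recall over sets of correspondences; the exact evaluation protocol behind the numbers above is not stated on this page, and the function name and example pairs are hypothetical.

```python
def precision_recall_f1(predicted: set, gold: set):
    """Compute precision, recall, and F1 for predicted vs. gold correspondences."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up (row, instance) correspondences for illustration only.
predicted = {("row1", "dbr:A"), ("row2", "dbr:B"), ("row3", "dbr:C")}
gold = {("row1", "dbr:A"), ("row2", "dbr:B"), ("row4", "dbr:D"), ("row5", "dbr:E")}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

In this made-up example two of the three predictions are correct and two of the four gold correspondences are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.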
