In this paper, we present the latest improvements to the DAGOBAH system, which performs automatic pre-processing and semantic interpretation of tables. In particular, we report promising results obtained in the SemTab 2021 challenge thanks to optimisations in lookup mechanisms and new techniques for studying the context of nodes in the target knowledge graph. We also present the deployment of DAGOBAH algorithms within the Orange company via the TableAnnotation API and a front-end DAGOBAH user interface. These two access methods accelerate the adoption of Semantic Table Interpretation solutions within the company to meet industrial needs.

Results from the Paper


| Task                   | Dataset              | Model   | Metric Name | Metric Value | Global Rank |
|------------------------|----------------------|---------|-------------|--------------|-------------|
| Cell Entity Annotation | BiodivTab            | DAGOBAH | F1 (%)      | 62           | #3          |
| Column Type Annotation | BiodivTab            | DAGOBAH | F1 (%)      | 34.4         | #4          |
| Column Type Annotation | GitTables-SemTab-DBP | DAGOBAH | F1 (%)      | 7.00         | #2          |
| Column Type Annotation | GitTables-SemTab-SCH | DAGOBAH | F1 (%)      | 18.3         | #2          |
| Column Type Annotation | ToughTables-DBP      | DAGOBAH | F1 (%)      | 42.2         | #3          |
| Cell Entity Annotation | ToughTables-DBP      | DAGOBAH | F1 (%)      | 94.5         | #1          |
| Cell Entity Annotation | ToughTables-WD       | DAGOBAH | F1 (%)      | 92.3         | #3          |
| Column Type Annotation | ToughTables-WD       | DAGOBAH | F1 (%)      | 83.2         | #1          |
