DAGOBAH: Table and Graph Contexts for Efficient Semantic Annotation of Tabular Data
In this paper, we present the latest improvements to the DAGOBAH system, which performs automatic pre-processing and semantic interpretation of tables. In particular, we report promising results obtained in the SemTab 2021 challenge thanks to optimisations in lookup mechanisms and new techniques for studying the context of nodes in the target knowledge graph. We also present the deployment of DAGOBAH algorithms within the Orange company via the TableAnnotation API and a front-end DAGOBAH user interface. These two access methods help accelerate the adoption of Semantic Table Interpretation solutions within the company to meet industrial needs.
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Cell Entity Annotation | BiodivTab | DAGOBAH | F1 (%) | 62 | #3
Column Type Annotation | BiodivTab | DAGOBAH | F1 (%) | 34.4 | #4
Column Type Annotation | GitTables-SemTab-DBP | DAGOBAH | F1 (%) | 7.00 | #2
Column Type Annotation | GitTables-SemTab-SCH | DAGOBAH | F1 (%) | 18.3 | #2
Column Type Annotation | ToughTables-DBP | DAGOBAH | F1 (%) | 42.2 | #3
Cell Entity Annotation | ToughTables-DBP | DAGOBAH | F1 (%) | 94.5 | #1
Cell Entity Annotation | ToughTables-WD | DAGOBAH | F1 (%) | 92.3 | #3
Column Type Annotation | ToughTables-WD | DAGOBAH | F1 (%) | 83.2 | #1