From Heuristics to Language Models: A Journey Through the Universe of Semantic Table Interpretation with DAGOBAH

This paper presents DAGOBAH SL 2022, a semantic table interpretation system that has been continuously improved over the last four years of participation in the SemTab challenge. This year, we have improved lookup coverage using external resources and integrated language models to better understand table headers. We have also implemented various system optimizations that reduce execution time by about 30%. We also show the relevance of deep learning-based approaches for resolving certain ambiguities, and we discuss the limitations of existing corpora and systems for further maturing this research field.

Results from the Paper


Task                    Dataset         Model    Metric Name  Metric Value  Global Rank
Column Type Annotation  ToughTables-WD  DAGOBAH  F1 (%)       40.9          # 5
Cell Entity Annotation  ToughTables-WD  DAGOBAH  F1 (%)       94.5          # 1
