Flexible Table Recognition and Semantic Interpretation System

25 May 2021  ·  Marcin Namysl, Alexander M. Esser, Sven Behnke, Joachim köhler ·

Table extraction is an important but still unsolved problem. In this paper, we introduce a flexible and modular table extraction system. We develop two rule-based algorithms that perform the complete table recognition process, including table detection and segmentation, and support the most frequent table formats. Moreover, to incorporate the extraction of semantic information, we develop a graph-based table interpretation method. We conduct extensive experiments on the challenging table recognition benchmarks ICDAR 2013 and ICDAR 2019, achieving results competitive with state-of-the-art approaches. Our complete information extraction system exhibited a high F1 score of 0.7380. To support future research on information extraction from documents, we make the resources (ground-truth annotations, evaluation scripts, algorithm parameters) from our table interpretation experiment publicly available.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here