7 dataset results for Table Detection AND Images

ICDAR 2013

The ICDAR 2013 dataset consists of 229 training images and 233 testing images, with word-level annotations provided. It is the standard benchmark dataset for evaluating near-horizontal text detection.

229 PAPERS • 3 BENCHMARKS

FUNSD (Form Understanding in Noisy Scanned Documents)

Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task. The proposed dataset can be used for various tasks, including text detection, optical character recognition, spatial layout analysis, and entity labeling/linking.

142 PAPERS • 3 BENCHMARKS

SciTSR

SciTSR is a large-scale table structure recognition dataset, which contains 15,000 tables in PDF format and their corresponding structure labels obtained from LaTeX source files.

32 PAPERS • NO BENCHMARKS YET

PubTables-1M (PubMed Tables One Million)

The goal of PubTables-1M is to create a large, detailed, high-quality dataset for training and evaluating a wide variety of models for the tasks of table detection, table structure recognition, and functional analysis.

14 PAPERS • NO BENCHMARKS YET

IIIT-AR-13K

IIIT-AR-13K was created by manually annotating the bounding boxes of graphical or page objects in publicly available annual reports. The dataset contains 13k annotated page images with objects in five popular categories: table, figure, natural image, logo, and signature. It is the largest manually annotated dataset for graphical object detection.

6 PAPERS • NO BENCHMARKS YET

TNCR Dataset (Table Net Detection and Classification Dataset)

TNCR is a table dataset with varying image quality, collected from free, open-source websites. It can be used for detecting tables in scanned document images and for classifying the detected tables into 5 different classes.

2 PAPERS • NO BENCHMARKS YET

STDW

STDW is a large-scale dataset for table detection with more than seven thousand samples, covering a wide variety of table structures collected from many different sources.

1 PAPER • 1 BENCHMARK
