15 dataset results for Scene Text Detection

ICDAR 2013

The ICDAR 2013 dataset consists of 229 training images and 233 testing images, with word-level annotations provided. It is the standard benchmark dataset for evaluating near-horizontal text detection.

230 PAPERS • 3 BENCHMARKS

Total-Text

Total-Text is a text detection dataset that consists of 1,555 images with a variety of text types including horizontal, multi-oriented, and curved text instances. The training split and testing split have 1,255 images and 300 images, respectively.

143 PAPERS • 2 BENCHMARKS

MSRA-TD500 (MSRA Text Detection 500 Database)

The MSRA-TD500 dataset is a text detection dataset that contains 300 training images and 200 test images. Text regions are arbitrarily oriented and annotated at sentence level. Unlike the other datasets, it contains both English and Chinese text.

119 PAPERS • 1 BENCHMARK

COCO-Text

COCO-Text is a dataset for text detection and recognition. It is based on the MS COCO dataset, which contains images of complex everyday scenes. COCO-Text contains non-text images, legible text images, and illegible text images. In total there are 22,184 training images and 7,026 validation images with at least one instance of legible text.

80 PAPERS • 2 BENCHMARKS

ICDAR 2015

ICDAR 2015 is a scene text detection dataset used for the ICDAR 2015 conference.

51 PAPERS • 2 BENCHMARKS

SCUT-CTW1500

The SCUT-CTW1500 dataset contains 1,500 images: 1,000 for training and 500 for testing. In particular, it provides 10,751 cropped text instance images, including 3,530 with curved text. The images were manually collected from the Internet, from image libraries such as Google Open-Image, and from phone cameras. The dataset also contains many horizontal and multi-oriented text instances.

41 PAPERS • 3 BENCHMARKS

TextOCR

TextOCR is a dataset for benchmarking text recognition on arbitrarily shaped scene text in natural images. It provides ~1M high-quality word annotations on TextVQA images, enabling end-to-end reasoning on downstream tasks such as visual question answering or image captioning.

21 PAPERS • NO BENCHMARKS YET

RCTW-17 (Reading Chinese Text in the Wild)

RCTW-17 is a competition featuring a large-scale dataset with 12,263 annotated images. Two tasks are set up: text localization and end-to-end recognition. The competition ran from January 20 to May 31, 2017, and received 23 valid submissions from 19 teams.

19 PAPERS • NO BENCHMARKS YET

ICDAR 2017

ICDAR 2017 is a dataset for scene text detection.

18 PAPERS • 1 BENCHMARK

Chinese Text in the Wild

Chinese Text in the Wild is a dataset of about 1 million Chinese character instances, covering 3,850 unique characters, annotated by experts in over 30,000 street view images. It is a challenging and diverse dataset that includes planar text, raised text, text under poor illumination, distant text, partially occluded text, etc.

5 PAPERS • NO BENCHMARKS YET

PKU (License Plate Detection)

The PKU dataset has almost 4,000 images categorized into five groups (G1-G5) that show different situations. For example, G1 has images of highways during the day with only one car in them. On the other hand, G5 has images of crosswalks during the day or at night with multiple cars and license plates (LPs).

2 PAPERS • NO BENCHMARKS YET

ShopSign

ShopSign is a newly developed natural scene text dataset of Chinese shop signs in street views.

2 PAPERS • NO BENCHMARKS YET

UrduDoc

The UrduDoc Dataset is a benchmark dataset for Urdu text line detection in scanned documents. It is created as a byproduct of the UTRSet-Real dataset generation process. Comprising 478 diverse images collected from various sources such as books, documents, manuscripts, and newspapers, it offers a valuable resource for research in Urdu document analysis. It includes 358 pages for training and 120 pages for validation, featuring a wide range of styles, scales, and lighting conditions. It serves as a benchmark for evaluating printed Urdu text detection models, and the benchmark results of state-of-the-art models are provided. The Contour-Net model demonstrates the best performance in terms of h-mean.

1 PAPER • 1 BENCHMARK
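
The h-mean cited in the UrduDoc entry is, in text detection benchmarks, generally the harmonic mean of precision and recall. The sketch below shows that calculation only. The function names and the counts in the usage lines are illustrative assumptions, not UrduDoc's evaluation code or results, and how a detection is matched to a ground-truth region (for example by an IoU threshold) is left outside the sketch.

```python
def h_mean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (also called F-measure)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was matched
    return 2 * precision * recall / (precision + recall)


def detection_scores(num_matched: int, num_detections: int, num_ground_truth: int):
    """Return (precision, recall, h-mean) from match counts.

    num_matched: detections matched to a ground-truth region (hypothetical
    input; the matching rule itself is not defined here).
    """
    precision = num_matched / num_detections if num_detections else 0.0
    recall = num_matched / num_ground_truth if num_ground_truth else 0.0
    return precision, recall, h_mean(precision, recall)


# Illustrative counts only, not taken from any benchmark.
precision, recall, score = detection_scores(
    num_matched=8, num_detections=10, num_ground_truth=16
)
print(f"precision={precision:.3f} recall={recall:.3f} h-mean={score:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector cannot score well on h-mean by maximizing precision or recall alone.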

CNTD (Chinese and Naxi text detection)

A Chinese and Naxi scene text detection dataset, annotated with labelme and stored as JSON.

0 PAPERS • NO BENCHMARKS YET
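
The CNTD entry mentions labelme and JSON. As a minimal sketch, assuming the annotation files follow the usual labelme layout (a top-level `shapes` list whose items carry `label` and `points`), text regions could be read as shown below. The sample record, its label value, and the file name in it are made up for illustration and are not taken from CNTD.

```python
import json
from typing import List, Tuple

Polygon = List[Tuple[float, float]]


def load_labelme_polygons(annotation: dict) -> List[Tuple[str, Polygon]]:
    """Extract (label, polygon) pairs from a parsed labelme-style record."""
    regions = []
    for shape in annotation.get("shapes", []):
        points = [(float(x), float(y)) for x, y in shape["points"]]
        regions.append((shape["label"], points))
    return regions


# Hypothetical record in the usual labelme layout (not real CNTD data).
SAMPLE = json.loads("""
{
  "version": "5.0.1",
  "shapes": [
    {
      "label": "chinese",
      "points": [[10, 20], [110, 20], [110, 60], [10, 60]],
      "shape_type": "polygon"
    }
  ],
  "imagePath": "example.jpg",
  "imageHeight": 480,
  "imageWidth": 640
}
""")

for label, polygon in load_labelme_polygons(SAMPLE):
    print(label, polygon)
```

For files on disk, the same function can be applied to the result of `json.load` on an open annotation file.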

Indian Number Plates Dataset | Vehicle Number Plates | English OCR Detection

This dataset is an extremely challenging set of more than 20,000 original number plate images captured and crowdsourced from more than 700 urban and rural areas. Each image is manually reviewed and verified by computer vision professionals at Datacluster Labs.

0 PAPERS • NO BENCHMARKS YET