Total-Text is a text detection dataset that consists of 1,555 images with a variety of text types including horizontal, multi-oriented, and curved text instances. The training split and testing split have 1,255 images and 300 images, respectively.
143 PAPERS • 2 BENCHMARKS
ICDAR 2015 is a scene text detection dataset used for the ICDAR 2015 conference.
51 PAPERS • 2 BENCHMARKS
The SCUT-CTW1500 dataset contains 1,500 images: 1,000 for training and 500 for testing. In particular, it provides 10,751 cropped text instance images, including 3,530 with curved text. The images are manually harvested from the Internet, image libraries such as Google Open-Image, or phone cameras. Besides curved text, the dataset also contains many horizontal and multi-oriented text instances.
41 PAPERS • 3 BENCHMARKS
LSVTD is a large-scale video text dataset intended to promote research in video text spotting. It contains 100 text videos from 22 different real-life scenarios: 13 indoor (e.g., bookstore, shopping mall) and 9 outdoor, which is more than 3 times the diversity of IC15.
7 PAPERS • NO BENCHMARKS YET
40,764 images (11,659 protest images and hard negatives) with various annotations of visual attributes and sentiments.
2 PAPERS • NO BENCHMARKS YET
Extends COCO-Text [Veit et al. 2016] with information about the scene (such as objects and places appearing in the image). This enables researchers to include semantic relations between text and scene in their text spotting systems, and offers a common framework for such approaches.
1 PAPER • NO BENCHMARKS YET