105,941 Images Natural Scenes OCR Data of 12 Languages

Description: This dataset contains 105,941 images of text in natural scenes. It covers 12 languages (6 Asian and 6 European), a variety of natural scenes, and multiple photographic angles. Each text line is annotated with a quadrilateral bounding box and its transcription. The data can be used for tasks such as multilingual OCR.
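As an illustration of what a line-level quadrilateral annotation with transcription might look like in code, here is a minimal Python sketch. The description does not specify the annotation file format, so the class name `TextLine`, its field names, and the sample values are all hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class TextLine:
    """Hypothetical record for one annotated text line (not the dataset's real schema)."""

    corners: List[Point]   # four (x, y) corners, listed in order around the box
    transcription: str     # the text content of the line
    language: str          # assumed language tag, e.g. "French"

    def __post_init__(self) -> None:
        # A quadrilateral has exactly four corners.
        if len(self.corners) != 4:
            raise ValueError("a quadrilateral needs exactly 4 corners")

    def area(self) -> float:
        """Area of the quadrilateral in square pixels, via the shoelace formula."""
        total = 0.0
        for i in range(4):
            x1, y1 = self.corners[i]
            x2, y2 = self.corners[(i + 1) % 4]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


# Made-up sample values, for illustration only.
line = TextLine(
    corners=[(10, 20), (110, 25), (108, 55), (8, 50)],
    transcription="EXAMPLE",
    language="French",
)
print(line.transcription, line.area())
```

A quadrilateral, rather than an axis-aligned rectangle, lets the box follow text photographed at an angle, which fits the multiple photographic angles mentioned in the description.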

Data size: 105,941 images in total. Asian languages: Japanese 9,997 images, Korean 10,231 images, Indonesian 7,591 images, Malay 5,650 images, Vietnamese 8,822 images, Thai 9,645 images. European languages: French 10,015 images, German 7,213 images, Italian 8,824 images, Portuguese 7,754 images, Russian 10,376 images, Spanish 9,823 images.
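The per-language counts can be checked against the stated total with a short Python sketch. The counts are copied from the data size line; the variable names are only illustrative.

```python
# Per-language image counts as listed in the data size description.
ASIAN = {
    "Japanese": 9997,
    "Korean": 10231,
    "Indonesian": 7591,
    "Malay": 5650,
    "Vietnamese": 8822,
    "Thai": 9645,
}
EUROPEAN = {
    "French": 10015,
    "German": 7213,
    "Italian": 8824,
    "Portuguese": 7754,
    "Russian": 10376,
    "Spanish": 9823,
}

asian_total = sum(ASIAN.values())        # 51,936 images
european_total = sum(EUROPEAN.values())  # 54,005 images
grand_total = asian_total + european_total

# The two groups together account for the full 105,941 images.
print(asian_total, european_total, grand_total)
```

The split is roughly even between the two groups, and every language has between about 5,600 and 10,400 images.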

Collection scenes: shop signs, stop boards, posters, tickets, road signs, comics, cover pictures, prompts/reminders, warnings, packing instructions, menus, building signs, etc.

