UnrealText: Synthesizing Realistic Scene Text Images from the Unreal World

CVPR 2020  ·  Shangbang Long, Cong Yao ·

Synthetic data has been a critical tool for training scene text detection and recognition models. On the one hand, synthetic word images have proven to be a successful substitute for real images in training scene text recognizers. On the other hand, however, scene text detectors still heavily rely on a large amount of manually annotated real-world images, which are expensive. In this paper, we introduce UnrealText, an efficient image synthesis method that renders realistic images via a 3D graphics engine. 3D synthetic engine provides realistic appearance by rendering scene and text as a whole, and allows for better text region proposals with access to precise scene information, e.g. normal and even object meshes. The comprehensive experiments verify its effectiveness on both scene text detection and recognition. We also generate a multilingual version for future research into multilingual scene text detection and recognition. Additionally, we re-annotate scene text recognition datasets in a case-sensitive way and include punctuation marks for more comprehensive evaluations. The code and the generated datasets are released at: https://github.com/Jyouhou/UnrealText/ .

PDF Abstract CVPR 2020 PDF CVPR 2020 Abstract

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here