Scene Text Detection is a task to detect text regions in the complex background and label them with bounding boxes.
|TREND||DATASET||BEST METHOD||PAPER TITLE||PAPER||CODE||COMPARE|
Over the past few years, the field of scene text detection has progressed rapidly that modern text detectors are able to hunt text in various challenging scenarios.
Moreover, a paired low-quality scene text video dataset named Text-RBL is proposed, consisting of raw videos, blurry videos, and low-resolution videos, labeled by the proposed convenient semi-automatic labeling strategy.
The framework utilizes Conceptual Text Regions (CTRs), which is a class of cognition-based tools inheriting good mathematical properties, allowing for sophisticated label design.
Detection and recognition of scene texts of arbitrary shapes remain a grand challenge due to the super-rich text shape variation in text line orientations, lengths, curvatures, etc.
Natural scene text detection is an important aspect of scene understanding and could be a useful tool in building engaging augmented reality applications.
With only bounding-box annotations in the spatial domain, existing video scene text detection (VSTD) benchmarks lack temporal relation of text instances among video frames, which hinders the development of video text-related applications.
The advantages of our method are as follows: (1) overcomes the color confusion problem between text and background caused by dirty data; (2) the texts generated are allowed to appear in most locations of any image, even across depths; (3) avoids analyzing the depth of background, such that the performance of our method exceeds the state-of-the-art methods; (4) the speed of generating images is fast, nearly one picture generated per three milliseconds.