Scene Text Detection
91 papers with code • 9 benchmarks • 15 datasets
Scene Text Detection is a computer vision task that involves automatically identifying and localizing text within natural images or videos. The goal of scene text detection is to develop algorithms that can robustly detect text and label it with bounding boxes in uncontrolled and complex environments, for example on street signs, billboards, or license plates.
Source: ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection
Latest papers
Recurrent Generic Contour-based Instance Segmentation with Progressive Learning
It maintains a single estimate of the contour that is progressively deformed toward the object boundary.
CBNet: A Plug-and-Play Network for Segmentation-Based Scene Text Detection
Segmentation-based methods have recently become popular in scene text detection; they mainly consist of two steps: text kernel segmentation and expansion.
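The two-step idea of kernel segmentation followed by expansion can be illustrated with a minimal sketch. It assumes the kernels have already been segmented and labeled, and it grows them with a plain breadth-first search over 4-connected neighbors. This is an illustration of the general expansion step only, not CBNet's method; the function name `expand_kernels` and the toy masks are hypothetical.

```python
from collections import deque


def expand_kernels(kernel_labels, text_mask):
    """Grow labeled kernels into neighboring text pixels (illustrative sketch).

    kernel_labels: 2D list of ints, 0 for background, >0 for a kernel id.
    text_mask: 2D list of 0/1 flags marking the full text region.
    Returns a new 2D list; the inputs are not modified.
    """
    height, width = len(text_mask), len(text_mask[0])
    labels = [row[:] for row in kernel_labels]
    # Seed the search with every pixel that already belongs to a kernel.
    queue = deque(
        (r, c) for r in range(height) for c in range(width) if labels[r][c] > 0
    )
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < height and 0 <= nc < width
            # Claim unlabeled text pixels; the first kernel to arrive wins.
            if inside and text_mask[nr][nc] and labels[nr][nc] == 0:
                labels[nr][nc] = labels[r][c]
                queue.append((nr, nc))
    return labels


# Toy example: two kernels inside two text regions joined by one pixel.
kernels = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 2, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
text = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
]
expanded = expand_kernels(kernels, text)
```

Because each kernel keeps its own label while growing, adjacent text instances that touch in the full text mask can still be separated after expansion.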
DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting
In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.
Arbitrary Shape Text Detection via Segmentation with Probability Maps
To be concrete, we adopt a Sigmoid Alpha Function (SAF) to transfer the distances between boundaries and their inside pixels to a probability map.
DPText-DETR: Towards Better Scene Text Detection with Dynamic Points in Transformer
However, these methods built upon the detection transformer framework might achieve sub-optimal training efficiency and performance due to coarse positional query modeling. In addition, the point label form exploited in previous works implies the reading order of humans, which impedes the detection robustness from our observation.
TraffSign: Multilingual Traffic Signboard Text Detection and Recognition for Urdu and English
To this end, we present Deep Learning Laboratory’s Traffic Signboards Dataset (DLL-TraffSiD) to develop multi-lingual text detection and recognition methods for traffic signboards.
Vision-Language Pre-Training for Boosting Scene Text Detectors
In this paper, we specifically adapt vision-language joint learning for scene text detection, a task that intrinsically involves cross-modal interaction between the two modalities: vision and language, since text is the written form of language.
Towards End-to-End Unified Scene Text Detection and Layout Analysis
In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.
SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition
End-to-end scene text spotting has attracted great attention in recent years due to the success of exploiting the intrinsic synergy between scene text detection and recognition.
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion
By incorporating the proposed DB and ASF with the segmentation network, our proposed scene text detector consistently achieves state-of-the-art results, in terms of both detection accuracy and speed, on five standard benchmarks.
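As a rough illustration of the DB (differentiable binarization) step named above, the sketch below applies the commonly cited approximate step function B = 1 / (1 + exp(-k (P - T))) to a probability map P and a threshold map T. The function name, the toy maps, and the amplification factor k = 50 are assumptions made for this example; ASF and the segmentation network itself are not shown.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map B = 1 / (1 + exp(-k * (P - T))), per pixel.

    prob_map and thresh_map are 2D lists of floats in [0, 1] with equal shape.
    k controls how steep the soft step is (k = 50 is an assumed default).
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Toy maps (hypothetical values): pixels well above the local threshold
# approach 1, pixels below it approach 0, and P == T gives exactly 0.5.
prob = [[0.9, 0.2], [0.5, 0.35]]
thresh = [[0.3, 0.3], [0.5, 0.3]]
binary = differentiable_binarization(prob, thresh)
```

Unlike a hard threshold, this soft step has a nonzero gradient everywhere, which is what allows the binarization to be trained jointly with the segmentation network.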