Scene Text Recognition
121 papers with code • 15 benchmarks • 27 datasets
See Scene Text Detection for leaderboards in this task.
Libraries
Use these libraries to find Scene Text Recognition models and implementations
Most implemented papers
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
Detecting and recognizing text in natural scene images is a challenging, yet not completely solved task.
Pedestrian-Synthesis-GAN: Generating Pedestrian Data in Real Scene and Beyond
The results show that our framework can smoothly synthesize pedestrians on varied background images with different levels of detail.
On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
Scene text recognition (STR) is the task of recognizing character sequences in natural scenes.
A Feasible Framework for Arbitrary-Shaped Scene Text Recognition
Deep learning-based methods have achieved surprising progress in Scene Text Recognition (STR), one of the classic problems in computer vision.
SCATTER: Selective Context Attentional Scene Text Recognizer
The first attention step re-weights visual features from a CNN backbone together with contextual features computed by a BiLSTM layer.
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks
A scene text image contains two levels of content: visual texture and semantic information.
SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition
Scene text recognition is a hot research topic in computer vision.
Arabic Scene Text Recognition in the Deep Learning Era: Analysis on A Novel Dataset
Therefore, we use our new dataset to evaluate the problem of Arabic scene text recognition from three perspectives: (1) using deep learning techniques and studying their suitability for Arabic scene text recognition, where we identify essential components required for the model to obtain good performance; (2) identifying Arabic text challenges that differ from Latin text and require special attention; (3) investigating a bilingual model that concurrently deals with Arabic and English words, since Arabic text is usually found along with other languages.
Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition
To tackle this issue, in this paper, we propose a dual parallel attention network (DPAN), in which a newly designed parallel context attention module (PCAM) is cascaded with the original PPAM, using linguistic contextual information to compensate for the information inconsistency between queries and keys.
TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models
Text recognition is a long-standing research problem for document digitalization.