Optical Character Recognition (OCR)
307 papers with code • 5 benchmarks • 42 datasets
Optical Character Recognition, or Optical Character Reader (OCR), is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo (for example, text on signs and billboards in a landscape photo, or license plates on cars), or from subtitle text superimposed on an image (for example, from a television broadcast).
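At its core, OCR maps image patches to characters. A minimal sketch of that idea, using toy 3x3 binary glyph bitmaps (invented here for illustration, not a real font) and simple template matching rather than any real OCR engine:

```python
# Toy template-matching "OCR": match each image patch against known
# glyph templates and emit the best-matching character. The glyph
# bitmaps are made up for this sketch; real systems learn features
# from data instead of matching fixed templates.

GLYPHS = {
    "A": ((0, 1, 0), (1, 1, 1), (1, 0, 1)),
    "T": ((1, 1, 1), (0, 1, 0), (0, 1, 0)),
    "O": ((1, 1, 1), (1, 0, 1), (1, 1, 1)),
}

def render(text):
    """Render text into a 3-row binary 'image' (a tuple of pixel rows)."""
    rows = [[], [], []]
    for ch in text:
        for r in range(3):
            rows[r].extend(GLYPHS[ch][r])
    return tuple(tuple(r) for r in rows)

def recognize(image):
    """Slide a 3x3 window across the image; pick the glyph with the
    highest pixel agreement at each position."""
    width = len(image[0])
    out = []
    for x in range(0, width, 3):
        patch = tuple(tuple(row[x:x + 3]) for row in image)
        best = max(GLYPHS, key=lambda g: sum(
            GLYPHS[g][r][c] == patch[r][c]
            for r in range(3) for c in range(3)))
        out.append(best)
    return "".join(out)

print(recognize(render("TAO")))  # → TAO
```

Modern methods in the list below replace the fixed templates with learned convolutional features and the per-patch decision with a sequence decoder (CTC or attention), but the image-to-text mapping is the same task.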
Libraries
Use these libraries to find Optical Character Recognition (OCR) models and implementations.
Most implemented papers
Chinese Text in the Wild
[Python 3.6] Natural-scene text detection implemented with TensorFlow; CTPN+CRNN+CTC implemented in Keras/PyTorch for variable-length scene text OCR.
Exploring Cross-Image Pixel Contrast for Semantic Segmentation
Inspired by the recent advance in unsupervised contrastive representation learning, we propose a pixel-wise contrastive framework for semantic segmentation in the fully supervised setting.
COCO-Text: Dataset and Benchmark for Text Detection and Recognition in Natural Images
The goal of COCO-Text is to advance state-of-the-art in text detection and recognition in natural images.
End-to-End Interpretation of the French Street Name Signs Dataset
We introduce the French Street Name Signs (FSNS) Dataset consisting of more than a million images of street name signs cropped from Google Street View images of France.
OCR-free Document Understanding Transformer
Current Visual Document Understanding (VDU) methods outsource the task of reading text to off-the-shelf Optical Character Recognition (OCR) engines and focus on the understanding task with the OCR outputs.
Attention-based Extraction of Structured Information from Street View Imagery
We present a neural network model - based on CNNs, RNNs and a novel attention mechanism - which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith'16), which achieved 72.46%.
STN-OCR: A single Neural Network for Text Detection and Text Recognition
In contrast to most existing works, which consist of multiple deep neural networks and several pre-processing steps, we propose to use a single deep neural network that learns to detect and recognize text from natural images in a semi-supervised way.
E2E-MLT - an Unconstrained End-to-End Method for Multi-Language Scene Text
An end-to-end trainable (fully differentiable) method for multi-language scene text localization and recognition is proposed.
NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
Considering scene image has large variation in text and background, we further design a modality-transform block to effectively transform 2D input images to 1D sequences, combined with the encoder to extract more discriminative features.
ASTER: An Attentional Scene Text Recognizer with Flexible Rectification
Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications.