We present a neural network model, based on CNNs, RNNs and a novel attention mechanism, which achieves 84.2% accuracy on the challenging French Street Name Signs (FSNS) dataset, significantly outperforming the previous state of the art (Smith'16), which achieved 72.46%.
Ranked #1 on Optical Character Recognition on FSNS - Test
We describe efforts to adapt the Tesseract open source OCR engine for multiple scripts and languages.
Meanwhile, several pre-trained models for Chinese and English recognition are released, including a text detector (trained on 97K images), a direction classifier (600K images), and a text recognizer (17.9M images).
We introduce the French Street Name Signs (FSNS) dataset, consisting of more than a million images of street name signs cropped from Google Street View images of France.
Ranked #3 on Optical Character Recognition on FSNS - Test
The goal of COCO-Text is to advance the state of the art in text detection and recognition in natural images.
OBJECT RECOGNITION, OPTICAL CHARACTER RECOGNITION, SCENE TEXT, SCENE TEXT DETECTION, SCENE UNDERSTANDING
We empirically demonstrate that the proposed approach achieves competitive performance on various challenging semantic segmentation benchmarks: Cityscapes, ADE20K, LIP, PASCAL-Context, and COCO-Stuff.
Ranked #2 on Semantic Segmentation on LIP val
Despite the large number of both commercial and academic methods for Automatic License Plate Recognition (ALPR), most existing approaches are focused on a specific license plate (LP) region (e.g. European, US, Brazilian, Taiwanese, etc.).
Ranked #2 on License Plate Recognition on AOLP-RP
We present a neural encoder-decoder model to convert images into presentational markup based on a scalable coarse-to-fine attention mechanism.
Thus, we propose a lightweight scene text recognition model named Hamming OCR.
OPTICAL CHARACTER RECOGNITION, SCENE TEXT, SCENE TEXT RECOGNITION