TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Scene Text Recognition	ICDAR 2003	CRNN	Accuracy	89.4	# 12
Scene Text Recognition	ICDAR2013	CRNN	Accuracy	86.7	# 36
Scene Text Recognition	SVT	CRNN	Accuracy	80.8	# 34

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-end-to-end-trainable-neural-network-for/scene-text-recognition-on-icdar-2003)](https://paperswithcode.com/sota/scene-text-recognition-on-icdar-2003?p=an-end-to-end-trainable-neural-network-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-end-to-end-trainable-neural-network-for/scene-text-recognition-on-svt)](https://paperswithcode.com/sota/scene-text-recognition-on-svt?p=an-end-to-end-trainable-neural-network-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/an-end-to-end-trainable-neural-network-for/scene-text-recognition-on-icdar2013)](https://paperswithcode.com/sota/scene-text-recognition-on-icdar2013?p=an-end-to-end-trainable-neural-network-for)`

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

21 Jul 2015 · Baoguang Shi, Xiang Bai, Cong Yao ·

Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. Compared with previous systems for scene text recognition, the proposed architecture possesses four distinctive properties: (1) It is end-to-end trainable, in contrast to most of the existing algorithms whose components are separately trained and tuned. (2) It naturally handles sequences in arbitrary lengths, involving no character segmentation or horizontal scale normalization. (3) It is not confined to any predefined lexicon and achieves remarkable performances in both lexicon-free and lexicon-based scene text recognition tasks. (4) It generates an effective yet much smaller model, which is more practical for real-world application scenarios. The experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets, demonstrate the superiority of the proposed algorithm over the prior arts. Moreover, the proposed algorithm performs well in the task of image-based music score recognition, which evidently verifies the generality of it.

PDF Abstract

Code

Add Remove Mark official

PaddlePaddle/PaddleOCR

38,418

JaidedAI/EasyOCR

↳ Quickstart in

Spaces

21,930

mindee/doctr

↳ Quickstart in

Colab

Spaces

3,032

meijieru/crnn.pytorch

2,334

bgshih/crnn

2,040

See all 83 implementations

Tasks

Add Remove

Optical Character Recognition (OCR)

Scene Text Recognition

Datasets

ICDAR 2013

ICDAR 2003

SVT

Results from the Paper

Edit

Ranked #12 on Scene Text Recognition on ICDAR 2003

Get a GitHub badge

Task	Dataset	Model	Metric Name	Metric Value	Global Rank	Benchmark
Scene Text Recognition	ICDAR 2003	CRNN	Accuracy	89.4	# 12	Compare
Scene Text Recognition	ICDAR2013	CRNN	Accuracy	86.7	# 36	Compare
Scene Text Recognition	SVT	CRNN	Accuracy	80.8	# 34	Compare

Methods

Add Remove

No methods listed for this paper. Add relevant methods here

Edit Social Preview

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Edit

Methods

Add Remove