Optical Character Recognition (OCR)

314 papers with code • 5 benchmarks • 42 datasets

Optical Character Recognition or Optical Character Reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten, or printed text into machine-encoded text. The source can be a scanned document, a photo of a document, a scene photo (for example, the text on signs and billboards in a landscape photo, or license plates on cars), or subtitle text superimposed on an image (for example, from a television broadcast).

Benchmarks

These leaderboards are used to track progress in Optical Character Recognition (OCR).

Dataset	Best Model
Benchmarking Chinese Text Recognition: Datasets, Baselines, and an Empirical Study	DTrOCR
FSNS - Test	AttentionOCR_Inception-resnet-v2_Location
I2L-140K	I2L-NOPOOL
SUT	Tesseract
im2latex-100k	I2L-STRIPS

Libraries

Use these libraries to find Optical Character Recognition (OCR) models and implementations.

PaddlePaddle/PaddleOCR • 18 papers • 38,684 stars
open-mmlab/mmocr • 6 papers • 4,090 stars
alibabaresearch/advancedliteratemac… • 5 papers • 965 stars
Media-Smart/vedastr • 5 papers • 531 stars

See all 10 libraries.

Datasets

Subtasks

Irregular Text Recognition

Handwritten Chinese Text Recognition

Offline Handwritten Chinese Character Recognition

Word Spotting In Handwritten Documents

Handwritten Digit Image Synthesis

Grapheme Detection

Most implemented papers

ASTER: An Attentional Scene Text Recognizer with Flexible Rectification

bgshih/aster • 2018

Scene text recognition has attracted great interest from academia and industry in recent years owing to its importance in a wide range of applications.

Stroke extraction for offline handwritten mathematical expression recognition

chungkwong/mathocr-myscript • 16 May 2019

Given a ready-made state-of-the-art online handwritten mathematical expression recognizer, the proposed procedure correctly recognized 58.22%, 65.65%, and 65.22% of the offline formulas rendered from the datasets of the Competitions on Recognition of Online Handwritten Mathematical Expressions (CROHME) in 2014, 2016, and 2019, respectively.

FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents

cydal/LayoutML_pytorch • 27 May 2019

We present a new dataset for form understanding in noisy scanned documents (FUNSD) that aims at extracting and structuring the textual content of forms.

Multimodal deep networks for text and image-based document classification

Quicksign/ocrized-text-dataset • 15 Jul 2019

Classification of document images is a critical step for archival of old manuscripts, online subscription and administrative procedures.

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

amzn/convolutional-handwriting-gan • CVPR 2020

This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design.

Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders

IVRL/w2s • ICLR 2021

Deep Learning based methods have emerged as the indisputable leaders for virtually all image restoration tasks.

PP-OCRv2: Bag of Tricks for Ultra Lightweight OCR System

PaddlePaddle/PaddleOCR • 7 Sep 2021

Optical Character Recognition (OCR) systems have been widely used in a variety of application scenarios.

DocScanner: Robust Document Image Rectification with Progressive Learning

fh2019ustc/DocScanner • 28 Oct 2021

The iterative refinements make DocScanner converge to a robust and superior rectification performance, while the lightweight recurrent architecture ensures the running efficiency.

DiT: Self-supervised Pre-training for Document Image Transformer

microsoft/unilm • 4 Mar 2022

We leverage DiT as the backbone network in a variety of vision-based Document AI tasks, including document image classification, document layout analysis, table detection as well as text detection for OCR.

Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification

ihsaan-ullah/meta-album • NeurIPS 2022

We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, meta-learning, among other tasks.
