Text Spotting

52 papers with code • 4 benchmarks • 6 datasets

Text Spotting combines Scene Text Detection and Scene Text Recognition in a single end-to-end system, so that text in natural images can be both located and read.

Benchmarks

These leaderboards track progress in Text Spotting.

Dataset	Best Model
ICDAR 2015	UNITS
Total-Text	DeepSolo (ViTAEv2-S, TextOCR)
SCUT-CTW1500	A3S
Inverse-Text	DeepSolo (ViTAEv2-S, TextOCR)

Libraries

Use these libraries to find Text Spotting models and implementations

Repository	Papers	Stars
hikopensource/davar-lab-ocr	4	706
mxin262/swintextspotter	3	253
vitae-transformer/vitae-transformer…	2	

Most implemented papers

Open Images V5 Text Annotation and Yet Another Mask Text Spotter

openvinotoolkit/training_extensions • 23 Jun 2021

A large scale human-labeled dataset plays an important role in creating high quality deep learning models.

SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition

mxin262/swintextspotter • CVPR 2022

End-to-end scene text spotting has attracted great attention in recent years due to the success of excavating the intrinsic synergy of the scene text detection and recognition.

GLASS: Global to Local Attention for Scene-Text Spotting

amazon-research/glass-text-spotting • 5 Aug 2022

In recent years, the dominant paradigm for text spotting is to combine the tasks of text detection and recognition into a single end-to-end framework.

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

vitae-transformer/deepsolo • CVPR 2023

In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting

vitae-transformer/deepsolo • 31 May 2023

In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.

Bridging the Gap Between End-to-End and Two-Step Text Spotting

mxin262/bridging-text-spotting • 6 Apr 2024

Subsequently, we introduce a Bridge that connects the locked detector and recognizer through a zero-initialized neural network.

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

lvpengyuan/masktextspotter.caffe2 • ECCV 2018

Recently, models based on deep neural networks have dominated the fields of scene text detection and recognition.

Visual Semantic Re-ranker for Text Spotting

ahmedssabir/Visual-Semantic-Relatedness-with-Word-Embedding • 23 Oct 2018

In this paper, we propose a post-processing approach to improve the accuracy of text spotting by using the semantic relation between the text and the scene.

You Only Recognize Once: Towards Fast Video Text Spotting

hikopensource/davar-lab-ocr • 8 Mar 2019

Video text spotting is still an important research topic due to its various real-applications.

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

MhLiao/MaskTextSpotter • ECCV 2018

Moreover, we further investigate the recognition module of our method separately, which significantly outperforms state-of-the-art methods on both regular and irregular text datasets for scene text recognition.
