Scene Text Detection
91 papers with code • 9 benchmarks • 15 datasets
Scene Text Detection is a computer vision task that involves automatically identifying and localizing text within natural images or videos. The goal of scene text detection is to develop algorithms that can robustly detect and label text with bounding boxes in uncontrolled and complex environments, such as street signs, billboards, or license plates.
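Detected boxes are typically matched against ground-truth annotations using intersection-over-union (IoU), the standard criterion in detection benchmarks. A minimal sketch, assuming an illustrative axis-aligned (x1, y1, x2, y2) box format (real benchmarks often use rotated or polygonal boxes):

```python
# Sketch: IoU between a predicted and a ground-truth text box.
# Box format (x1, y1, x2, y2) is an illustrative assumption; scene text
# benchmarks commonly use quadrilaterals or polygons instead.

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A prediction is usually counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.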
Source: ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection
Libraries
Use these libraries to find Scene Text Detection models and implementations.
Latest papers
DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond
In this report, we introduce DocXChain, a powerful open-source toolchain for document parsing, which is designed and developed to automatically convert the rich information embodied in unstructured documents, such as text, tables and charts, into structured representations that are readable and manipulable by machines.
Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions.
STEP -- Towards Structured Scene-Text Spotting
We introduce the structured scene-text spotting task, which requires a scene-text OCR system to spot text in the wild according to a query regular expression.
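The core idea can be illustrated with a small sketch: given the (text, box) outputs of any spotter, keep only the instances whose transcription matches a query regular expression. The detections below are mock data, not the paper's method or outputs:

```python
import re

# Sketch of regex-conditioned spotting: filter OCR outputs by a query pattern.
# The detections and the license-plate pattern are illustrative assumptions.

def filter_by_query(spotted, pattern):
    """Return (text, box) pairs whose text fully matches the regex pattern."""
    query = re.compile(pattern)
    return [(text, box) for text, box in spotted if query.fullmatch(text)]

detections = [
    ("AB-1234", (10, 20, 90, 45)),   # license-plate-like string
    ("EXIT",    (120, 5, 160, 30)),  # unrelated scene text
    ("CD-9876", (15, 60, 95, 85)),
]
plates = filter_by_query(detections, r"[A-Z]{2}-\d{4}")
# plates contains only the two plate-like detections
```

In the actual task the query conditions the spotter itself, rather than post-filtering its outputs as done here.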
MixNet: Toward Accurate Detection of Challenging Scene Text in the Wild
Detecting small scene text instances in the wild is particularly challenging, where the influence of irregular positions and non-ideal lighting often leads to detection errors.
Turning a CLIP Model into a Scene Text Spotter
Utilizing only 10% of the supervised data, FastTCM-CR50 improves performance by an average of 26.5% and 5.5% for text detection and spotting tasks, respectively.
SRFormer: Text Detection Transformer with Incorporated Segmentation and Regression
In light of this, we constrain the incorporation of segmentation branches to the first few decoder layers and employ progressive regression refinement in subsequent layers, achieving performance gains while minimizing computational load from the mask. Furthermore, we propose a Mask-informed Query Enhancement module.
LRANet: Towards Accurate and Efficient Scene Text Detection with Low-Rank Approximation Network
Next, we propose a dual assignment scheme for speed acceleration.
ViTEraser: Harnessing the Power of Vision Transformers for Scene Text Removal with SegMIM Pretraining
As ViTEraser implicitly integrates text localization and inpainting, we propose a novel end-to-end pretraining method, termed SegMIM, which focuses the encoder and decoder on the text box segmentation and masked image modeling tasks, respectively.
DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting
In this paper, we present DeepSolo++, a simple DETR-like baseline that lets a single decoder with explicit points solo for text detection, recognition, and script identification simultaneously.
Turning a CLIP Model into a Scene Text Detector
Recently, pretraining approaches based on vision-language models have made effective progress in the field of text detection.