Text Spotting Transformers

In this paper, we present TExt Spotting TRansformers (TESTR), a generic end-to-end text spotting framework using Transformers for text detection and recognition in the wild. TESTR builds on a single encoder and dual decoders for joint text-box control-point regression and character recognition. Unlike most existing methods, our approach is free from Region-of-Interest operations and heuristics-driven post-processing; TESTR is particularly effective on curved text boxes, where special care is needed to adapt traditional bounding-box representations. We show that our canonical control-point representation suits text instances with both Bezier-curve and polygon annotations. In addition, we design a bounding-box-guided polygon detection (box-to-polygon) process. Experiments on curved and arbitrarily shaped text datasets demonstrate the state-of-the-art performance of the proposed TESTR algorithm.
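The Bezier-curve annotation mentioned above typically describes a curved text boundary with cubic Bezier control points (e.g. one curve for the top edge and one for the bottom). As a minimal sketch of how such control points map to boundary points, here is a cubic Bezier evaluation; the function name and the sample control points are illustrative, not from the paper:

```python
import numpy as np

def cubic_bezier(ctrl, ts):
    """Evaluate a cubic Bezier curve at parameters ts.

    ctrl: (4, 2) array of control points.
    ts:   (n,) parameter values in [0, 1].
    Returns an (n, 2) array of points on the curve.
    """
    ts = np.asarray(ts, dtype=float)[:, None]
    c0, c1, c2, c3 = ctrl
    # Standard cubic Bernstein basis.
    return ((1 - ts) ** 3 * c0
            + 3 * (1 - ts) ** 2 * ts * c1
            + 3 * (1 - ts) * ts ** 2 * c2
            + ts ** 3 * c3)

# Hypothetical top edge of a curved text instance, given by 4 control points.
top = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 1.0], [3.0, 0.0]])
pts = cubic_bezier(top, np.linspace(0.0, 1.0, 8))  # 8 points along the edge
```

Sampling a fixed number of points like this is one way Bezier and polygon annotations can be reduced to a common sequence of boundary points for regression.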

PDF Abstract (CVPR 2022)

Results from the Paper


| Task          | Dataset      | Model | Metric Name                    | Metric Value | Global Rank |
|---------------|--------------|-------|--------------------------------|--------------|-------------|
| Text Spotting | ICDAR 2015   | TESTR | F-measure (%) - Strong Lexicon | 85.2         | #6          |
| Text Spotting | ICDAR 2015   | TESTR | F-measure (%) - Weak Lexicon   | 79.4         | #8          |
| Text Spotting | ICDAR 2015   | TESTR | F-measure (%) - Generic Lexicon| 73.6         | #9          |
| Text Spotting | Inverse-Text | TESTR | F-measure (%) - No Lexicon     | 34.2         | #8          |
| Text Spotting | Inverse-Text | TESTR | F-measure (%) - Full Lexicon   | 41.6         | #8          |
| Text Spotting | SCUT-CTW1500 | TESTR | F-measure (%) - No Lexicon     | 56.0         | #9          |
| Text Spotting | SCUT-CTW1500 | TESTR | F-measure (%) - Full Lexicon   | 81.5         | #3          |
| Text Spotting | Total-Text   | TESTR | F-measure (%) - Full Lexicon   | 83.9         | #7          |
| Text Spotting | Total-Text   | TESTR | F-measure (%) - No Lexicon     | 73.3         | #9          |

Methods


No methods listed for this paper.