A3S: Adversarial learning of semantic representations for Scene-Text Spotting

21 Feb 2023 · Masato Fujitake

Scene-text spotting is the task of locating text regions in natural scene images and, at the same time, recognizing the characters they contain. It has attracted much attention in recent years because of its wide range of applications. Existing research has focused mainly on improving text region detection rather than text recognition. As a result, detection accuracy has improved, but end-to-end accuracy remains insufficient. Text in natural scene images tends to be not a random string of characters but a meaningful one, that is, a word. We therefore propose adversarial learning of semantic representations for scene-text spotting (A3S) to improve end-to-end accuracy, including text recognition. Rather than performing text recognition from existing visual features alone, A3S simultaneously predicts semantic features in the detected text area. Experimental results on publicly available datasets show that the proposed method achieves better accuracy than other methods.


Results from the Paper

Task           Dataset       Model  Metric Name                      Metric Value  Global Rank
Text Spotting  ICDAR 2015    A3S    F-measure (%) - Strong Lexicon   84.8          # 7
Text Spotting  ICDAR 2015    A3S    F-measure (%) - Weak Lexicon     83.7          # 3
Text Spotting  ICDAR 2015    A3S    F-measure (%) - Generic Lexicon  79.6          # 2
Text Spotting  SCUT-CTW1500  A3S    F-measure (%) - No Lexicon       64.4          # 1
Text Spotting  SCUT-CTW1500  A3S    F-measure (%) - Full Lexicon     82.3          # 2
Text Spotting  Total-Text    A3S    F-measure (%) - Full Lexicon     85.1          # 5
Text Spotting  Total-Text    A3S    F-measure (%) - No Lexicon       79.4          # 4
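
Every result above is an F-measure, the harmonic mean of precision and recall. As a rough illustration of how such a percentage is derived, the sketch below computes it from match counts. It assumes a prediction counts as a true positive only when both the detected region and the recognized text match a ground-truth word; the exact matching criteria of each benchmark are not stated here. The function name `f_measure` and the counts are hypothetical and are not taken from the paper.

```python
def f_measure(true_positives: int, num_predictions: int, num_ground_truth: int) -> float:
    """Return the F-measure (harmonic mean of precision and recall) in percent."""
    # With no correct matches, precision and recall are both zero (or undefined),
    # so the score is reported as 0.
    if true_positives == 0 or num_predictions == 0 or num_ground_truth == 0:
        return 0.0
    precision = true_positives / num_predictions   # fraction of predictions that are correct
    recall = true_positives / num_ground_truth     # fraction of ground-truth words that are found
    return 100.0 * 2.0 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only (assumption, not data from the paper):
# 80 correct end-to-end matches out of 100 predictions and 100 ground-truth words.
score = f_measure(true_positives=80, num_predictions=100, num_ground_truth=100)
print(f"F-measure: {score:.1f}%")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a spotter with high detection recall but poor recognition precision still scores low, which is why end-to-end F-measure is reported rather than detection accuracy alone.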
