TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Scene Text Recognition	ICDAR 2003	SAFL	Accuracy	95.0	# 4
Scene Text Recognition	ICDAR2013	SAFL	Accuracy	92.8	# 28
Scene Text Recognition	ICDAR2015	SAFL	Accuracy	77.5	# 21
Scene Text Recognition	SVT	SAFL	Accuracy	88.6	# 29

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/safl-a-self-attention-scene-text-recognizer-1/scene-text-recognition-on-icdar-2003)](https://paperswithcode.com/sota/scene-text-recognition-on-icdar-2003?p=safl-a-self-attention-scene-text-recognizer-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/safl-a-self-attention-scene-text-recognizer-1/scene-text-recognition-on-icdar2015)](https://paperswithcode.com/sota/scene-text-recognition-on-icdar2015?p=safl-a-self-attention-scene-text-recognizer-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/safl-a-self-attention-scene-text-recognizer-1/scene-text-recognition-on-icdar2013)](https://paperswithcode.com/sota/scene-text-recognition-on-icdar2013?p=safl-a-self-attention-scene-text-recognizer-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/safl-a-self-attention-scene-text-recognizer-1/scene-text-recognition-on-svt)](https://paperswithcode.com/sota/scene-text-recognition-on-svt?p=safl-a-self-attention-scene-text-recognizer-1)`

SAFL: A Self-Attention Scene Text Recognizer with Focal Loss

1 Jan 2022 · Bao Hieu Tran, Thanh Le-Cong, Huu Manh Nguyen, Duc Anh Le, Thanh Hung Nguyen, Phi Le Nguyen

In recent decades, scene text recognition has gained worldwide attention from both the academic community and end users due to its importance in a wide range of applications. Despite achievements in optical character recognition, scene text recognition remains challenging due to inherent problems such as distortions and irregular layouts. Most existing approaches rely on recurrent or convolutional neural networks. However, recurrent neural networks (RNNs) usually suffer from slow training due to sequential computation and encounter problems such as vanishing gradients or bottlenecks, while convolutional neural networks (CNNs) face a trade-off between complexity and performance. In this paper, we introduce SAFL, a self-attention-based neural network model with focal loss for scene text recognition, to overcome the limitations of existing approaches. Using focal loss instead of negative log-likelihood helps the model focus more on low-frequency samples during training. Moreover, to deal with distorted and irregular text, we use a Spatial Transformer Network (STN) to rectify the text before passing it to the recognition network. We perform experiments to evaluate the proposed model on seven benchmarks. The numerical results show that our model achieves the best performance.
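The abstract's contrast between focal loss and negative log-likelihood can be illustrated with a minimal sketch. It assumes the standard focal loss form, FL(p_t) = -(1 - p_t)^γ · log(p_t), applied to a single character prediction; the abstract does not state the exact variant or hyperparameters SAFL uses, so the function names, the `gamma=2.0` default, and the toy logits below are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def nll_loss(logits, target):
    """Negative log-likelihood of the target class."""
    p_t = softmax(logits)[target]
    return -math.log(p_t)


def focal_loss(logits, target, gamma=2.0):
    """Standard focal loss (assumed form): NLL scaled by (1 - p_t)^gamma.

    gamma is a hypothetical default here; gamma = 0 recovers plain NLL.
    """
    p_t = softmax(logits)[target]
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


# Toy logits over three character classes; the target is class 0 in both cases.
easy = [4.0, 0.0, 0.0]  # target already predicted with high confidence
hard = [0.0, 1.0, 0.0]  # target predicted with low confidence

for name, logits in (("easy", easy), ("hard", hard)):
    print(name, "NLL:", nll_loss(logits, 0), "focal:", focal_loss(logits, 0))
```

Under this form, the confidently classified example has its loss scaled down far more than the poorly classified one, so hard or rare samples contribute relatively more to training than they would under plain negative log-likelihood.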

Code

ICMLA-SAFL/SAFL_pytorch official

Tasks

Optical Character Recognition

Optical Character Recognition (OCR)

Scene Text Recognition

Datasets

ICDAR 2013

ICDAR 2003

SVT

Results from the Paper

Ranked #4 on Scene Text Recognition on ICDAR 2003

Methods

Focal Loss • SPEED
