1 code implementation • 8 Apr 2020 • Hamza Abbad, Shengwu Xiong
In this paper, we propose an approach to tackle the problem of the automatic restoration of Arabic diacritics that includes three components stacked in a pipeline: a deep learning model which is a multi-layer recurrent neural network with LSTM and Dense layers, a character-level rule-based corrector which applies deterministic operations to prevent some errors, and a word-level statistical corrector which uses the context and the distance information to fix some diacritization issues.
Ranked #5 on Arabic Text Diacritization on Tashkeela