Arabic Text Diacritization
7 papers with code • 2 benchmarks • 3 datasets
Adding diacritics to undiacritized Arabic text to disambiguate words.
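To make the task concrete: the same undiacritized letters can stand for different words, for example k-t-b can be read as "kataba" (he wrote) or "kutub" (books), and the diacritics select the reading. The sketch below shows the inverse operation, which is a common way to build input/target pairs from diacritized text. It is a minimal illustration only: the function names are hypothetical, and the character range covers just the eight core marks U+064B to U+0652, which is an assumption rather than a convention taken from any paper listed here.

```python
import re

# Core Arabic diacritics (tashkeel): U+064B (fathatan) through U+0652 (sukun).
# Assumption: other combining marks (e.g. U+0670) are deliberately left out.
DIACRITICS = re.compile("[\u064B-\u0652]")


def strip_diacritics(text: str) -> str:
    """Return the undiacritized form of `text` (the model input)."""
    return DIACRITICS.sub("", text)


def to_char_labels(text: str) -> list:
    """Split diacritized text into (base character, attached diacritics) pairs.

    Each base character becomes one position and the marks that follow it
    become that position's label (the model target). A mark with no
    preceding base character is kept as its own unlabeled position.
    """
    pairs = []
    for ch in text:
        if DIACRITICS.fullmatch(ch) and pairs:
            base, marks = pairs[-1]
            pairs[-1] = (base, marks + ch)
        else:
            pairs.append((ch, ""))
    return pairs


# "kataba": kaf + fatha, ta + fatha, ba + fatha
word = "\u0643\u064E\u062A\u064E\u0628\u064E"
print(strip_diacritics(word))
print(to_char_labels(word))
```

Framing diacritization as per-character labelling, as in the pairs above, is one common formulation; word-level or sequence-to-sequence formulations are also possible.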
Most implemented papers
Arabic Text Diacritization Using Deep Neural Networks
After constructing the dataset, existing tools and systems are tested on it.
Neural Arabic Text Diacritization: State of the Art Results and a Novel Approach for Machine Translation
In this work, we present several deep learning models for the automatic diacritization of Arabic text.
Multi-components System for Automatic Arabic Diacritization
In this paper, we propose an approach to the automatic restoration of Arabic diacritics that stacks three components in a pipeline: a deep learning model (a multi-layer recurrent neural network with LSTM and Dense layers), a character-level rule-based corrector that applies deterministic operations to prevent some errors, and a word-level statistical corrector that uses context and distance information to fix some diacritization issues.
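The three-stage structure described above can be sketched as a simple function composition. This is only a rough illustration of the stacking idea, not the authors' implementation: the function names, the single rule (no diacritics on characters outside the basic Arabic letter range), and the lookup-table word corrector are all assumptions made for this example, and the paper's statistical corrector uses context and distance information that this stand-in does not model.

```python
import re
from typing import Callable, Dict, List

DIACRITICS = re.compile("[\u064B-\u0652]")  # fathatan .. sukun


def is_arabic_letter(ch: str) -> bool:
    # Approximation: U+0621..U+064A (this range also includes tatweel).
    return 0x0621 <= ord(ch) <= 0x064A


def apply_char_rules(text: str, labels: List[str]) -> List[str]:
    """Stage 2 (hypothetical rule): drop labels on non-Arabic characters."""
    return [lab if is_arabic_letter(ch) else "" for ch, lab in zip(text, labels)]


def apply_word_lexicon(word: str, lexicon: Dict[str, str]) -> str:
    """Stage 3 (stand-in): replace a word if its bare form is in the lexicon."""
    return lexicon.get(DIACRITICS.sub("", word), word)


def diacritize(
    text: str,
    model: Callable[[str], List[str]],
    lexicon: Dict[str, str],
) -> str:
    # Stage 1: the model predicts one diacritic string per input character.
    labels = model(text)
    assert len(labels) == len(text), "model must return one label per character"
    # Stage 2: deterministic character-level corrections.
    labels = apply_char_rules(text, labels)
    merged = "".join(ch + lab for ch, lab in zip(text, labels))
    # Stage 3: word-level corrections on the merged output.
    return " ".join(apply_word_lexicon(w, lexicon) for w in merged.split(" "))


# Dummy model for demonstration: predicts fatha on every character.
def dummy_model(text: str) -> List[str]:
    return ["\u064E"] * len(text)


print(diacritize("\u0643\u062A\u0628 1", dummy_model, {}))
```

With an empty lexicon, the dummy model's output passes through stage 3 unchanged; supplying a lexicon entry for a bare word overrides the model's prediction for that word, which is the role a word-level corrector plays at the end of such a pipeline.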
CAMeL Tools: An Open Source Python Toolkit for Arabic Natural Language Processing
We present CAMeL Tools, a collection of open-source tools for Arabic natural language processing in Python.
Deep Diacritization: Efficient Hierarchical Recurrence for Improved Arabic Diacritization
We propose a novel architecture for labelling character sequences that achieves state-of-the-art results on the Tashkeela Arabic diacritization benchmark.
Effective Deep Learning Models for Automatic Diacritization of Arabic Text
We propose three deep learning models to recover Arabic text diacritics based on our work in a text-to-speech synthesis system using deep learning.
CATT: Character-based Arabic Tashkeel Transformer
Then, we applied the Noisy-Student approach to boost the performance of the best model.