The RuFa (Ruqaa-Farsi) dataset contains images of text written in one of two Arabic fonts: Ruqaa and Nastaliq (Farsi). It comprises 40,000 synthesized images and 516 real ones, 40,516 in total. Images are RGB JPGs at 100×100 px. The text in the images varies in number of words, position, size, and opacity.
Real images were extracted from:
“The Rules of Arabic Calligraphy” by Hashem Al-Khatat - 1986.
“Ottoman Fonts” by Muhammad Amin Osmanli Ketbkhana.
The synthesis process is described in detail in this post.
Dataset folder structure:
* /rufa (40,516 images)
  * /real (516 images)
    * /ruqaa (260 images)
    * /farsi (256 images)
  * /synth (40,000 images)
    * /ruqaa (20,000 images)
    * /farsi (20,000 images)
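
The folder layout above doubles as the labeling scheme: the first directory level gives the source (real or synth) and the second gives the font class. A minimal sketch of indexing the images from that layout follows. The function names `index_rufa` and `count_by_folder` are hypothetical helpers, not part of any released code, and the `.jpg`/`.jpeg` file extension is an assumption based on the stated JPG format.

```python
from pathlib import Path

# Expected image counts per (source, font) folder, copied from the listing above.
EXPECTED_COUNTS = {
    ("real", "ruqaa"): 260,
    ("real", "farsi"): 256,
    ("synth", "ruqaa"): 20_000,
    ("synth", "farsi"): 20_000,
}


def index_rufa(root):
    """Return (path, source, font) tuples for every JPG under root.

    Assumes the layout <root>/<source>/<font>/<file>; the .jpg/.jpeg
    extension filter is an assumption, not something the dataset states.
    """
    root = Path(root)
    samples = []
    for source, font in EXPECTED_COUNTS:
        folder = root / source / font
        if not folder.is_dir():
            continue  # tolerate a partial download
        for path in sorted(folder.iterdir()):
            if path.suffix.lower() in {".jpg", ".jpeg"}:
                samples.append((path, source, font))
    return samples


def count_by_folder(samples):
    """Count indexed samples per (source, font) pair."""
    counts = {key: 0 for key in EXPECTED_COUNTS}
    for _, source, font in samples:
        counts[(source, font)] += 1
    return counts
```

After downloading, comparing `count_by_folder(index_rufa("rufa"))` against `EXPECTED_COUNTS` is a quick way to confirm that all 40,516 images are present and sit in the expected folders.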