RuFa

The RuFa (Ruqaa-Farsi) dataset contains images of text written in one of two Arabic fonts: Ruqaa and Nastaliq (Farsi). It comprises 40,000 synthesized images and 516 real ones, 40,516 in total. Images are RGB JPGs at 100×100 px. The text in the images varies in number of words, position, size, and opacity.

Real images were extracted from:

  1. “The Rules of Arabic Calligraphy” by Hashem Al-Khatat - 1986.

  2. “Ottman Fonts” by Muhammad Amin Osmanli Ketbkhana.

The synthesis process is described in detail in this post.

Dataset folder structure:

/rufa (40,516 images)

  • /real (516 images)

    * /ruqaa (260 images)
    
    * /farsi (256 images)
    
  • /synth (40,000 images)

    * /ruqaa (20,000 images)
    
    * /farsi (20,000 images)
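
The folder layout above can be turned into a labelled file index with a few lines of Python. The following is a minimal sketch, not part of the dataset's own tooling: the function names `index_rufa` and `count_by_folder` are hypothetical, and the code assumes the images carry a `.jpg` or `.jpeg` extension and that the split and class folders are named exactly as listed.

```python
from collections import Counter
from pathlib import Path

# Folder names taken from the layout above.
SPLITS = ("real", "synth")
CLASSES = ("ruqaa", "farsi")


def index_rufa(root):
    """Return a list of (path, split, label) for every JPG under root."""
    root = Path(root)
    samples = []
    for split in SPLITS:
        for label in CLASSES:
            folder = root / split / label
            if not folder.is_dir():
                continue  # tolerate a partial copy of the dataset
            for path in sorted(folder.iterdir()):
                # Assumption: image files end in .jpg or .jpeg.
                if path.suffix.lower() in {".jpg", ".jpeg"}:
                    samples.append((path, split, label))
    return samples


def count_by_folder(samples):
    """Count images per (split, label) pair."""
    return Counter((split, label) for _, split, label in samples)
```

With a complete copy of the dataset, `count_by_folder(index_rufa("rufa"))` should report 260 and 256 images for the real Ruqaa and Farsi folders and 20,000 for each synthesized folder, matching the 40,516 total stated above. A mismatch would suggest an incomplete download or a different file extension.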
    

License


  • Unknown
