Egyptian Arabic Segmentation Dataset

Introduced by Samih et al. in A Neural Architecture for Dialectal Arabic Segmentation

The dataset contains 350 tweets written in the Egyptian dialect, totaling more than 8,000 words, 3,000 of them unique. The tweets are rich in dialectal content and cover most of the phonological, morphological, and syntactic phenomena of Egyptian Arabic. They also include Twitter-specific elements such as #hashtags, @mentions, emoticons, and URLs.
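Because the tweets mix dialectal words with Twitter-specific tokens, it can help to tell the two apart before applying any segmenter. The following is a minimal sketch of one way to do that with regular expressions. The patterns, the label names, and the functions classify_token and tag_tweet are assumptions made for illustration; they are not taken from the dataset or from the paper, and the emoticon pattern covers only a few simple ASCII forms.

```python
import re

# Hypothetical patterns for illustration only; not the dataset authors' rules.
TOKEN_PATTERNS = [
    ("url", re.compile(r"^(?:https?://|www\.)\S+$")),
    ("mention", re.compile(r"^@\w+$")),
    ("hashtag", re.compile(r"^#\w+$")),
    # Simple ASCII emoticons such as :) ;-) =D (far from exhaustive).
    ("emoticon", re.compile(r"^[:;=8][\-o']?[)(\]\[dDpP/\\|]$")),
]


def classify_token(token: str) -> str:
    """Return the first matching label, or "word" if nothing matches."""
    for label, pattern in TOKEN_PATTERNS:
        if pattern.match(token):
            return label
    return "word"


def tag_tweet(tweet: str) -> list[tuple[str, str]]:
    """Split a tweet on whitespace and label each token."""
    return [(tok, classify_token(tok)) for tok in tweet.split()]


if __name__ == "__main__":
    for token, label in tag_tweet("@user شفت #مصر http://t.co/abc :)"):
        print(label, token)
```

Only tokens labeled "word" would normally be passed on to a morphological segmenter; the other tokens can be kept whole. Whitespace splitting is a simplification, since real tweets often attach punctuation or emoticons directly to words.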


License


  • Unknown