ANETAC (Arabic Named Entity Transliteration and Classification)

Introduced by Ameur et al. in ANETAC: Arabic Named Entity Transliteration and Classification Dataset

An English-Arabic named entity transliteration and classification dataset built from freely available parallel translation corpora. The dataset contains 79,924 instances. Each instance is a triplet (e, a, c), where e is the English named entity, a is its Arabic transliteration, and c is its class: Person, Location, or Organization. ANETAC is primarily intended for researchers working on Arabic named entity transliteration, but it can also be used for named entity classification.
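
The triplet structure described above can be sketched in Python. This is a minimal illustration, not the dataset's official loader: the tab-separated layout, the column order, the names `EntityClass`, `Instance`, and `parse_line`, and the two sample rows are all assumptions made for the example and are not taken from the released files.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class EntityClass(Enum):
    """The three classes named in the dataset description."""
    PERSON = "Person"
    LOCATION = "Location"
    ORGANIZATION = "Organization"


@dataclass(frozen=True)
class Instance:
    """One (e, a, c) triplet."""
    english: str               # e: English named entity
    arabic: str                # a: Arabic transliteration
    entity_class: EntityClass  # c: entity class


def parse_line(line: str) -> Instance:
    # Assumed (hypothetical) layout: english<TAB>arabic<TAB>class
    english, arabic, label = line.rstrip("\n").split("\t")
    return Instance(english, arabic, EntityClass(label))


# Illustrative rows written for this sketch, not copied from ANETAC.
sample = [
    "London\tلندن\tLocation",
    "Adam\tآدم\tPerson",
]

instances = [parse_line(row) for row in sample]

# Per-class counts, useful when the data is used for classification.
class_counts = Counter(inst.entity_class for inst in instances)
```

For transliteration, a model would consume the `(english, arabic)` pairs. For classification, it would use `entity_class` as the label. An unrecognized label raises `ValueError` from the `EntityClass` lookup, which surfaces malformed rows early.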

Source: ANETAC: Arabic Named Entity Transliteration and Classification Dataset

License


  • Unknown
