Automatic acquisition of Urdu nouns (along with gender and irregular plurals)

LREC 2014  ·  Tafseer Ahmed Khan ·

The paper describes a set of methods to automatically acquire the Urdu nouns (and its gender) on the basis of inflectional and contextual clues. The algorithms used are a blend of computer{'}s brute force on the corpus and careful design of distinguishing rules on the basis linguistic knowledge. As there are homograph inflections for Urdu nouns, adjectives and verbs, we compare potential inflectional forms with paradigms of inflections in strict order and gives best guess (of part of speech) for the word. We also worked on irregular plurals i.e. the plural forms that are borrowed from Arabic, Persian and English. Evaluation shows that not all the borrowed rules have same productivity in Urdu. The commonly used borrowed plural rules are shown in the result.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here