Explicit Fine grained Syntactic and Semantic Annotation of the Idafa Construction in Arabic

Idafa in traditional Arabic grammar is an umbrella construction that covers several phenomena including what is expressed in English as noun-noun compounds and Saxon and Norman genitives. Additionally, Idafa participates in some other constructions, such as quantifiers, quasi-prepositions, and adjectives. Identifying the various types of the Idafa construction (IC) is of importance to Natural Language processing (NLP) applications. Noun-Noun compounds exhibit special behavior in most languages impacting their semantic interpretation. Hence distinguishing them could have an impact on downstream NLP applications. The most comprehensive syntactic representation of the Arabic language is the LDC Arabic Treebank (ATB). In the ATB, ICs are not explicitly labeled and furthermore, there is no distinction between ICs of noun-noun relations and other traditional ICs. Hence, we devise a detailed syntactic and semantic typification process of the IC phenomenon in Arabic. We target the ATB as a platform for this classification. We render the ATB annotated with explicit IC labels but with the further semantic characterization which is useful for syntactic, semantic and cross language processing. Our typification of IC comprises 3 main syntactic IC types: FIC, GIC, and TIC, and they are further divided into 10 syntactic subclasses. The TIC group is further classified into semantic relations. We devise a method for automatic IC labeling and compare its yield against the CATiB treebank. Our evaluation shows that we achieve the same level of accuracy, but with the additional fine-grained classification into the various syntactic and semantic types.

PDF Abstract LREC 2016 PDF LREC 2016 Abstract
No code implementations yet. Submit your code now


  Add Datasets introduced or used in this paper

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.


No methods listed for this paper. Add relevant methods here