Well-Defined Morphology is Sentence-Level Morphology

EMNLP (MRL) 2021  ·  Omer Goldman, Reut Tsarfaty ·

Morphological tasks have gained decent popularity within the NLP community in the recent years, with large multi-lingual datasets providing morphological analysis of words, either in or out of context. However, the lack of a clear linguistic definition for words destines the annotative work to be incomplete and mired in inconsistencies, especially cross-linguistically. In this work we expand morphological inflection of words to inflection of sentences to provide true universality disconnected from orthographic traditions of white-space usage. To allow annotation for sentence-inflection we define a morphological annotation scheme by a fixed set of inflectional features. We present a small cross-linguistic dataset including semi-manually generated simple sentences in 4 typologically diverse languages annotated according to our suggested scheme, and show that the task of reinflection gets substantially more difficult but that the change of scope from words to well-defined sentences allows interface with contextualized language models.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here