Universal Lemmatizer: A Sequence to Sequence Model for Lemmatizing Universal Dependencies Treebanks

3 Feb 2019Jenna KanervaFilip GinterTapio Salakoski

In this paper we present a novel lemmatization method based on a sequence-to-sequence neural network architecture and morphosyntactic context representation. In the proposed method, our context-sensitive lemmatizer generates the lemma one character at a time based on the surface form characters and its morphosyntactic features obtained from a morphological tagger... (read more)

PDF Abstract

Code


No code implementations yet. Submit your code now

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.