no code implementations • UDW (COLING) 2020 • Þórunn Arnardóttir, Hinrik Hafsteinsson, Einar Freyr Sigurðsson, Kristín Bjarnadóttir, Anton Karl Ingason, Hildur Jónsdóttir, Steinþór Steingrímsson
The topic of this paper is a rule-based pipeline for converting constituency treebanks based on the Penn Treebank format to Universal Dependencies (UD).
no code implementations • WS (NoDaLiDa) 2019 • Kristín Bjarnadóttir, Kristín Ingibjörg Hlynsdóttir, Steinþór Steingrímsson
The topic of this paper is The Database of Icelandic Morphology (DIM), a multipurpose linguistic resource, created for use in language technology, as a reference for the general public in Iceland, and for use in research on the Icelandic language.
1 code implementation • LREC 2020 • Jón Friðrik Daðason, David Erik Mollberg, Hrafn Loftsson, Kristín Bjarnadóttir
In this paper, we present a character-based BiLSTM model for splitting Icelandic compound words, and show how varying amounts of training data affects the performance of the model.
no code implementations • WS (NoDaLiDa) 2019 • Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson, Jón Friðrik Daðason, Kristín Bjarnadóttir
Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages.