Search Results for author: Starkaður Barkarson

Found 2 papers, 0 papers with code

Evolving Large Text Corpora: Four Versions of the Icelandic Gigaword Corpus

LREC 2022. Starkaður Barkarson, Steinþór Steingrímsson, Hildur Hafsteinsdóttir. No code implementations.

We show how the corpus has grown by almost 50% from the first version to the fourth and how it was restructured to better accommodate different metadata for different subcorpora.

Word Embeddings

Compiling and Filtering ParIce: An English-Icelandic Parallel Corpus

WS (NoDaLiDa) 2019. Starkaður Barkarson, Steinþór Steingrímsson. No code implementations.

We estimate that approximately 5% of the corpus data is noise or faulty alignments, while more than 50% of the segments we deleted were faulty.
