no code implementations • LREC 2022 • Starkaður Barkarson, Steinþór Steingrímsson, Hildur Hafsteinsdóttir
We show how the corpus has grown by almost 50% from the first version to the fourth and how it was restructured to better accommodate different metadata for different subcorpora.
no code implementations • WS (NoDaLiDa) 2019 • Starkaður Barkarson, Steinþór Steingrímsson
We estimate that approximately 5% of the corpus data consists of noise or faulty alignments, while more than 50% of the segments we deleted were faulty.