no code implementations • LREC 2022 • Vésteinn Snæbjarnarson, Haukur Barri Símonarson, Pétur Orri Ragnarsson, Svanhvít Lilja Ingólfsdóttir, Haukur Jónsson, Vilhjálmur Þorsteinsson, Hafsteinn Einarsson
To train the models, we introduce a new corpus of Icelandic text, the Icelandic Common Crawl Corpus (IC3), a collection of high-quality texts found online by targeting the Icelandic top-level domain .is.
no code implementations • LREC 2022 • Vésteinn Snæbjarnarson, Hafsteinn Einarsson
The dataset is a valuable resource for Icelandic, which we demonstrate by creating and evaluating a system capable of extractive QA in Icelandic.
no code implementations • NIDCP (LREC) 2022 • Steinunn Rut Friðriksdóttir, Hafsteinn Einarsson
In this paper, we present a novel approach to data collection for natural language processing (NLP), linguistic research and lexicographic work.
no code implementations • NAACL (MIA) 2022 • Vésteinn Snæbjarnarson, Hafsteinn Einarsson
Our approach requires only limited QA resources in the given language, along with machine-translated data, and at least a bilingual language model.
no code implementations • DCLRL (LREC) 2022 • Steinunn Rut Friðriksdóttir, Valdimar Ágúst Eggertsson, Benedikt Geir Jóhannesson, Hjalti Daníelsson, Hrafn Loftsson, Hafsteinn Einarsson
We describe our approach of using a multilingual entity linking model (mGENRE) in combination with Wikipedia API Search (WAPIS) to label our data and compare it to an approach using WAPIS only.
no code implementations • 14 Jan 2022 • Vésteinn Snæbjarnarson, Haukur Barri Símonarson, Pétur Orri Ragnarsson, Svanhvít Lilja Ingólfsdóttir, Haukur Páll Jónsson, Vilhjálmur Þorsteinsson, Hafsteinn Einarsson
To train the models, we introduce a new corpus of Icelandic text, the Icelandic Common Crawl Corpus (IC3), a collection of high-quality texts found online by targeting the Icelandic top-level domain (TLD).
no code implementations • 16 Aug 2018 • Hafsteinn Einarsson, Marcelo Matheus Gauy, Johannes Lengler, Florian Meier, Asier Mujika, Angelika Steger, Felix Weissenberger
For the first setup, we give a schedule that achieves a runtime of $(1\pm o(1))\beta n \ln n$, where $\beta \approx 3.552$, which is an asymptotic improvement over the runtime of the static setup.