Search Results for author: Hrafn Loftsson

Found 22 papers, 5 papers with code

Pre-training and Evaluating Transformer-based Language Models for Icelandic

no code implementations LREC 2022 Jón Guðnason, Hrafn Loftsson

We pre-train four types of monolingual ELECTRA and ConvBERT models and compare our results to a previously trained monolingual RoBERTa model and the multilingual mBERT model.

Dependency Parsing named-entity-recognition +4

Compiling a Highly Accurate Bilingual Lexicon by Combining Different Approaches

no code implementations gwll (LREC) 2022 Steinþór Steingrímsson, Luke O’Brien, Finnur Ingimundarson, Hrafn Loftsson, Andy Way

By combining the most promising approaches and data sets, and using confidence scores calculated from the data together with the results of manually evaluating samples as indicators, we are able to induce lists of translations with a very high acceptance rate.

Cross-Lingual Word Embeddings Machine Translation +1

Towards High Accuracy Named Entity Recognition for Icelandic

no code implementations WS (NoDaLiDa) 2019 Svanhvít Lilja Ingólfsdóttir, Sigurjón Þorsteinsson, Hrafn Loftsson

We report on work in progress which consists of annotating an Icelandic corpus for named entities (NEs) and using it for training a named entity recognizer based on a Bidirectional Long Short-Term Memory model.

Miscellaneous named-entity-recognition +5

SentAlign: Accurate and Scalable Sentence Alignment

1 code implementation 15 Nov 2023 Steinþór Steingrímsson, Hrafn Loftsson, Andy Way

We present SentAlign, an accurate sentence alignment tool designed to handle very large parallel document pairs.

Machine Translation Sentence +1

Building an Icelandic Entity Linking Corpus

no code implementations DCLRL (LREC) 2022 Steinunn Rut Friðriksdóttir, Valdimar Ágúst Eggertsson, Benedikt Geir Jóhannesson, Hjalti Daníelsson, Hrafn Loftsson, Hafsteinn Einarsson

We describe our approach of using a multilingual entity linking model (mGENRE) in combination with Wikipedia API Search (WAPIS) to label our data and compare it to an approach using WAPIS only.

Entity Linking

Semi-self-supervised Automated ICD Coding

no code implementations 20 May 2022 Hlynur D. Hlynsson, Steindór Ellertsson, Jón F. Daðason, Emil L. Sigurdsson, Hrafn Loftsson

Clinical Text Notes (CTNs) contain physicians' reasoning process, written in an unstructured free text format, as they examine and interview patients.

Data Augmentation Imputation

Effectively Aligning and Filtering Parallel Corpora under Sparse Data Conditions

no code implementations ACL 2020 Steinþór Steingrímsson, Hrafn Loftsson, Andy Way

When rich morphology exacerbates the data sparsity problem, it is imperative to have accurate alignment and filtering methods that can help make the most of what is available by maximising the number of correctly translated segments in a corpus and minimising noise by removing incorrect translations and segments containing extraneous data.

Machine Translation Translation

Kvistur 2.0: a BiLSTM Compound Splitter for Icelandic

1 code implementation LREC 2020 Jón Friðrik Daðason, David Erik Mollberg, Hrafn Loftsson, Kristín Bjarnadóttir

In this paper, we present a character-based BiLSTM model for splitting Icelandic compound words, and show how varying amounts of training data affect the performance of the model.

Part-Of-Speech Tagging

A Wide-Coverage Context-Free Grammar for Icelandic and an Accompanying Parsing System

no code implementations RANLP 2019 Vilhjálmur Þorsteinsson, Hulda Óladóttir, Hrafn Loftsson

Our parsing system is able to parse about 90% of all sentences in articles published on the main Icelandic news websites.

Nefnir: A high accuracy lemmatizer for Icelandic

no code implementations WS (NoDaLiDa) 2019 Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson, Jón Friðrik Daðason, Kristín Bjarnadóttir

Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages.

Lemmatization POS +1
