Search Results for author: Ariel Ekgren

Found 7 papers, 1 papers with code

The Nordic Pile: A 1.2TB Nordic Dataset for Language Modeling

no code implementations30 Mar 2023 Joey Öhman, Severine Verlinden, Ariel Ekgren, Amaru Cuba Gyllensten, Tim Isbister, Evangelia Gogoulou, Fredrik Carlsson, Magnus Sahlgren

Pre-training Large Language Models (LLMs) require massive amounts of text data, and the performance of the LLMs typically correlates with the scale and quality of the datasets.

Language Modelling

Cross-lingual Transfer of Monolingual Models

no code implementations LREC 2022 Evangelia Gogoulou, Ariel Ekgren, Tim Isbister, Magnus Sahlgren

Additionally, the results of evaluating the transferred models in source language tasks reveal that their performance in the source domain deteriorates after transfer.

Cross-Lingual Transfer Domain Adaptation

SenseCluster at SemEval-2020 Task 1: Unsupervised Lexical Semantic Change Detection

no code implementations SEMEVAL 2020 Amaru Cuba Gyllensten, Evangelia Gogoulou, Ariel Ekgren, Magnus Sahlgren

We (Team Skurt) propose a simple method to detect lexical semantic change by clustering contextualized embeddings produced by XLM-R, using K-Means++.

Change Detection Clustering +1

Text Categorization for Conflict Event Annotation

no code implementations LREC 2020 Fredrik Olsson, Magnus Sahlgren, Fehmi ben Abdesslem, Ariel Ekgren, Kristine Eck

We cast the problem of event annotation as one of text categorization, and compare state of the art text categorization techniques on event data produced within the Uppsala Conflict Data Program (UCDP).

Text Categorization

Cannot find the paper you are looking for? You can Submit a new open access paper.