1 code implementation • NoDaLiDa 2021 • Petter Mæhlum, Jeremy Barnes, Robin Kurtz, Lilja Øvrelid, Erik Velldal
This paper introduces NorecNeg – the first annotated dataset of negation for Norwegian.
no code implementations • NoDaLiDa 2021 • Vinit Ravishankar, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal
Multilingual pretrained language models are rapidly gaining popularity in NLP systems for non-English languages.
1 code implementation • GeBNLP (COLING) 2020 • Samia Touileb, Lilja Øvrelid, Erik Velldal
We also explore the differences in how this is done by male and female critics.
2 code implementations • WS (NoDaLiDa) 2019 • Jeremy Barnes, Samia Touileb, Lilja Øvrelid, Erik Velldal
This paper explores the use of multi-task learning (MTL) for incorporating external knowledge in neural models.
1 code implementation • WS (NoDaLiDa) 2019 • Petter Mæhlum, Jeremy Barnes, Lilja Øvrelid, Erik Velldal
This paper documents the creation of a large-scale dataset of evaluative sentences – i.e. both subjective and objective sentences that are found to be sentiment-bearing – based on mixed-domain professional reviews from various news sources.
no code implementations • ACL (GeBNLP) 2021 • Samia Touileb, Lilja Øvrelid, Erik Velldal
More specifically, we add information about the gender of critics and book authors when classifying the polarity of book reviews, and the polarity of the reviews when classifying the genders of authors and critics.
1 code implementation • COLING (CRAC) 2022 • Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, Lilja Øvrelid
We present the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian.
no code implementations • SemEval (NAACL) 2022 • Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal
In this paper, we introduce the first SemEval shared task on Structured Sentiment Analysis, for which participants are required to predict all sentiment graphs in a text, where a single sentiment graph is composed of a sentiment holder, target, expression and polarity.
3 code implementations • NAACL (GeBNLP) 2022 • Samia Touileb, Lilja Øvrelid, Erik Velldal
In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models.
1 code implementation • 10 Apr 2025 • Vladislav Mikhailov, Tita Enstad, David Samuel, Hans Christian Farsethås, Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
We describe the NorEval design and present the results of benchmarking 19 open-source pre-trained and instruction-tuned LMs for Norwegian in various scenarios.
no code implementations • 31 Jan 2025 • Egil Rønningstad, Lilja Charlotte Storset, Petter Mæhlum, Lilja Øvrelid, Erik Velldal
Sentiment analysis of patient feedback from the public health domain can aid decision makers in evaluating the provided services.
no code implementations • 19 Jan 2025 • Vladislav Mikhailov, Petter Mæhlum, Victoria Ovedie Chruickshank Langø, Erik Velldal, Lilja Øvrelid
This paper introduces a new suite of question answering datasets for Norwegian: NorOpenBookQA, NorCommonSenseQA, NorTruthfulQA, and NRK-Quiz-QA.
no code implementations • 13 Jan 2025 • Samia Touileb, Vladislav Mikhailov, Marie Kroka, Lilja Øvrelid, Erik Velldal
We introduce a dataset of high-quality human-authored summaries of news articles in Norwegian.
no code implementations • 12 Dec 2024 • Javier de la Rosa, Vladislav Mikhailov, Lemei Zhang, Freddy Wetjen, David Samuel, Peng Liu, Rolv-Arild Braaten, Petter Mæhlum, Magnus Breder Birkenes, Andrey Kutuzov, Tita Enstad, Hans Christian Farsethås, Svein Arne Brygfjeld, Jon Atle Gulla, Stephan Oepen, Erik Velldal, Wilfred Østgulen, Lilja Øvrelid, Aslak Sira Myhre
The use of copyrighted materials in training language models raises critical legal and ethical questions.
no code implementations • 9 Dec 2024 • David Samuel, Vladislav Mikhailov, Erik Velldal, Lilja Øvrelid, Lucas Georges Gabriel Charpentier, Andrey Kutuzov, Stephan Oepen
Training large language models requires vast amounts of data, posing a challenge for less widely spoken languages like Norwegian and even more so for truly low-resource languages like Northern Sámi.
1 code implementation • 4 Jul 2024 • Egil Rønningstad, Roman Klinger, Lilja Øvrelid, Erik Velldal
In order to better understand how sentiment regarding persons and organizations (each entity in our scope) is expressed in longer texts, we have collected a dataset of expert annotations where the overall sentiment regarding each entity is identified, together with the sentence-level sentiment for these entities separately.
1 code implementation • 7 Jun 2024 • Sondre Wold, Étienne Simon, Lucas Georges Gabriel Charpentier, Egor V. Kostylev, Erik Velldal, Lilja Øvrelid
Grounded language models use external sources of information, such as knowledge graphs, to meet some of the general challenges associated with pre-training.
no code implementations • 29 Apr 2024 • Petter Mæhlum, David Samuel, Rebecka Maria Norman, Elma Jelin, Øyvind Andresen Bjertnæs, Lilja Øvrelid, Erik Velldal
Sentiment analysis is an important tool for aggregating patient voices, in order to provide targeted improvements in healthcare services.
no code implementations • 5 Jun 2023 • Sondre Wold, Lilja Øvrelid, Erik Velldal
In contrast to large text corpora, knowledge graphs (KG) provide dense and structured representations of factual information.
1 code implementation • 6 May 2023 • David Samuel, Andrey Kutuzov, Samia Touileb, Erik Velldal, Lilja Øvrelid, Egil Rønningstad, Elina Sigdel, Anna Palatkina
We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics.
1 code implementation • COLING 2022 • Egil Rønningstad, Erik Velldal, Lilja Øvrelid
We show that sentiment in our dataset is expressed not only with an entity mention as target, but also towards targets with a sentiment-relevant relation to a volitional entity.
no code implementations • 12 Apr 2023 • Samia Touileb, Lilja Øvrelid, Erik Velldal
In this paper, we investigate how distributions of occupations with respect to gender are reflected in pre-trained language models.
2 code implementations • 17 Mar 2023 • David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal
While modern masked language models (LMs) are trained on ever larger corpora, we here explore the effects of down-scaling training to a modestly-sized but representative, well-balanced, and publicly available English text source: the British National Corpus.
1 code implementation • 31 Aug 2022 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift in the lexicographic sense of the term (or at least the status of these shifts is questionable).
1 code implementation • ACL 2022 • David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal
This paper demonstrates how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text.
2 code implementations • ACL 2021 • Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal
Structured sentiment analysis attempts to extract full opinion tuples from a text, but over time this task has been subdivided into smaller and smaller sub-tasks, e.g., target extraction or targeted polarity classification.
2 code implementations • NoDaLiDa 2021 • Andrey Kutuzov, Jeremy Barnes, Erik Velldal, Lilja Øvrelid, Stephan Oepen
We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training.
no code implementations • EACL 2021 • Jeremy Barnes, Lilja Øvrelid, Erik Velldal
Fine-grained sentiment analysis attempts to extract sentiment holders, targets and polar expressions and resolve the relationship between them, but progress has been hampered by the difficulty of annotation.
1 code implementation • 19 Feb 2020 • Jeremy Barnes, Vinit Ravishankar, Lilja Øvrelid, Erik Velldal
Documents are composed of smaller pieces – paragraphs, sentences, and tokens – that have complex relationships with one another.
1 code implementation • LREC 2020 • Lilja Øvrelid, Petter Mæhlum, Jeremy Barnes, Erik Velldal
We introduce NoReC_fine, a dataset for fine-grained sentiment analysis in Norwegian, annotated with respect to polar expressions, targets and holders of opinion.
1 code implementation • LREC 2020 • Fredrik Jørgensen, Tobias Aasmoe, Anne-Stine Ruud Husevåg, Lilja Øvrelid, Erik Velldal
This paper presents NorNE, a manually annotated corpus of named entities which extends the annotation of the existing Norwegian Dependency Treebank.
no code implementations • WS 2019 • Vinit Ravishankar, Memduh Gökırmak, Lilja Øvrelid, Erik Velldal
Encoders that generate representations based on context have, in recent years, benefited from adaptations that allow for pre-training on large text corpora.
no code implementations • WS 2019 • Julia Rodina, Daria Bakshandaeva, Vadim Fomin, Andrey Kutuzov, Samia Touileb, Erik Velldal
We measure the intensity of diachronic semantic shifts in adjectives in English, Norwegian and Russian across 5 decades.
1 code implementation • WS 2019 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
We extend the well-known word analogy task to a one-to-X formulation, including one-to-none cases, when no correct answer exists.
1 code implementation • 18 Jun 2019 • Jeremy Barnes, Erik Velldal, Lilja Øvrelid
Sentiment analysis is directly affected by compositional phenomena in language that act on the prior polarity of the words and phrases found in the text.
1 code implementation • WS 2019 • Jeremy Barnes, Lilja Øvrelid, Erik Velldal
Finally, we provide a case study that demonstrates the usefulness of the dataset to probe the performance of a given sentiment classifier with respect to linguistic phenomena.
no code implementations • WS 2019 • Vinit Ravishankar, Lilja Øvrelid, Erik Velldal
This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain.
no code implementations • EMNLP 2018 • Murhaf Fares, Stephan Oepen, Erik Velldal
In this paper, we empirically evaluate the utility of transfer and multi-task learning on a challenging semantic classification task: semantic interpretation of noun–noun compounds.
1 code implementation • 18 Sep 2018 • Murhaf Fares, Stephan Oepen, Erik Velldal
In this paper, we empirically evaluate the utility of transfer and multi-task learning on a challenging semantic classification task: semantic interpretation of noun–noun compounds.
no code implementations • COLING 2018 • Andrey Kutuzov, Lilja Øvrelid, Terrence Szymanski, Erik Velldal
Recent years have witnessed a surge of publications aimed at tracing temporal changes in lexical semantics using distributional methods, particularly prediction-based word embedding models.
1 code implementation • LREC 2018 • Erik Velldal, Lilja Øvrelid, Eivind Alexander Bergem, Cathrine Stadsnes, Samia Touileb, Fredrik Jørgensen
As resources for sentiment analysis have so far been unavailable for Norwegian, NoReC represents a highly valuable and sought-after addition to Norwegian language technology.
no code implementations • WS 2017 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
Recent studies have shown that word embedding models can be used to trace time-related (diachronic) semantic shifts in particular words.
no code implementations • EMNLP 2017 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
This paper deals with using word embedding models to trace the temporal dynamics of semantic relations between pairs of words.
no code implementations • WS 2017 • Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen
We expect that a more in-depth understanding of these choices across designs may lead to increased harmonization, or at least to more informed design of future representations.
no code implementations • WS 2017 • Martine Enger, Erik Velldal, Lilja Øvrelid
This paper presents an open-source toolkit for negation detection.
no code implementations • CONLL 2016 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid
This paper studies how word embeddings trained on the British National Corpus interact with part-of-speech boundaries.
no code implementations • LREC 2016 • Jonathon Read, Erik Velldal, Marc Cavazza, Gersende Georg
In this paper we present the Corpus of REcommendation STrength (CREST), a collection of HTML-formatted clinical guidelines annotated with the location of recommendations.
no code implementations • LREC 2014 • Emanuele Lapponi, Erik Velldal, Stephan Oepen, Rune Lain Knudsen
The Linguistic Annotation Framework (LAF) provides an abstract data model for specifying interchange representations to ensure interoperability among different annotation formats.