Search Results for author: Erik Velldal

Found 52 papers, 23 papers with code

Using Gender- and Polarity-Informed Models to Investigate Bias

no code implementations • ACL (GeBNLP) 2021 • Samia Touileb, Lilja Øvrelid, Erik Velldal

More specifically, we add information about the gender of critics and book authors when classifying the polarity of book reviews, and the polarity of the reviews when classifying the genders of authors and critics.

Language Modelling

Negation in Norwegian: an annotated dataset

1 code implementation • NoDaLiDa 2021 • Petter Mæhlum, Jeremy Barnes, Robin Kurtz, Lilja Øvrelid, Erik Velldal

This paper introduces NorecNeg – the first annotated dataset of negation for Norwegian.

Negation Sentence

Occupational Biases in Norwegian and Multilingual Language Models

1 code implementation • NAACL (GeBNLP) 2022 • Samia Touileb, Lilja Øvrelid, Erik Velldal

In this paper we explore how a demographic distribution of occupations, along gender dimensions, is reflected in pre-trained language models.

Descriptive

Multilingual ELMo and the Effects of Corpus Sampling

no code implementations • NoDaLiDa 2021 • Vinit Ravishankar, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal

Multilingual pretrained language models are rapidly gaining popularity in NLP systems for non-English languages.

Gender and sentiment, critics and authors: a dataset of Norwegian book reviews

1 code implementation • GeBNLP (COLING) 2020 • Samia Touileb, Lilja Øvrelid, Erik Velldal

We also explore the differences in how this is done by male and female critics.

NARC – Norwegian Anaphora Resolution Corpus

1 code implementation • COLING (CRAC) 2022 • Petter Mæhlum, Dag Haug, Tollef Jørgensen, Andre Kåsen, Anders Nøklestad, Egil Rønningstad, Per Erik Solberg, Erik Velldal, Lilja Øvrelid

We present the Norwegian Anaphora Resolution Corpus (NARC), the first publicly available corpus annotated with anaphoric relations between noun phrases for Norwegian.

Relation

SemEval 2022 Task 10: Structured Sentiment Analysis

no code implementations • SemEval (NAACL) 2022 • Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal

In this paper, we introduce the first SemEval shared task on Structured Sentiment Analysis, for which participants are required to predict all sentiment graphs in a text, where a single sentiment graph is composed of a sentiment holder, target, expression and polarity.

Sentiment Analysis

Lexicon information in neural sentiment analysis: a multi-task learning approach

2 code implementations • WS (NoDaLiDa) 2019 • Jeremy Barnes, Samia Touileb, Lilja Øvrelid, Erik Velldal

This paper explores the use of multi-task learning (MTL) for incorporating external knowledge in neural models.

Multi-Task Learning Sentence +1

Annotating evaluative sentences for sentiment analysis: a dataset for Norwegian

1 code implementation • WS (NoDaLiDa) 2019 • Petter Mæhlum, Jeremy Barnes, Lilja Øvrelid, Erik Velldal

This paper documents the creation of a large-scale dataset of evaluative sentences – i.e. both subjective and objective sentences that are found to be sentiment-bearing – based on mixed-domain professional reviews from various news-sources.

Sentiment Analysis

Text-To-KG Alignment: Comparing Current Methods on Classification Tasks

no code implementations • 5 Jun 2023 • Sondre Wold, Lilja Øvrelid, Erik Velldal

In contrast to large text corpora, knowledge graphs (KG) provide dense and structured representations of factual information.

Knowledge Graphs

NorBench -- A Benchmark for Norwegian Language Models

1 code implementation • 6 May 2023 • David Samuel, Andrey Kutuzov, Samia Touileb, Erik Velldal, Lilja Øvrelid, Egil Rønningstad, Elina Sigdel, Anna Palatkina

We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics.

Entity-Level Sentiment Analysis (ELSA): An exploratory task survey

1 code implementation • COLING 2022 • Egil Rønningstad, Erik Velldal, Lilja Øvrelid

We show that sentiment in our dataset is expressed not only with an entity mention as target, but also towards targets with a sentiment-relevant relation to a volitional entity.

coreference-resolution Sentence +1

Measuring Normative and Descriptive Biases in Language Models Using Census Data

no code implementations • 12 Apr 2023 • Samia Touileb, Lilja Øvrelid, Erik Velldal

We investigate in this paper how distributions of occupations with respect to gender are reflected in pre-trained language models.

Descriptive

Trained on 100 million words and still in shape: BERT meets British National Corpus

2 code implementations • 17 Mar 2023 • David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal

While modern masked language models (LMs) are trained on ever larger corpora, we here explore the effects of down-scaling training to a modestly-sized but representative, well-balanced, and publicly available English text source -- the British National Corpus.

Language Modelling

Contextualized language models for semantic change detection: lessons learned

1 code implementation • 31 Aug 2022 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

Our findings show that contextualized methods can often predict high change scores for words which are not undergoing any real diachronic semantic shift in the lexicographic sense of the term (or at least the status of these shifts is questionable).

Change Detection

Direct parsing to sentiment graphs

1 code implementation • ACL 2022 • David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal

This paper demonstrates how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text.

Sentiment Analysis

Structured Sentiment Analysis as Dependency Graph Parsing

2 code implementations • ACL 2021 • Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal

Structured sentiment analysis attempts to extract full opinion tuples from a text, but over time this task has been subdivided into smaller and smaller sub-tasks, e.g., target extraction or targeted polarity classification.

Sentiment Analysis

Large-Scale Contextualised Language Modelling for Norwegian

2 code implementations • NoDaLiDa 2021 • Andrey Kutuzov, Jeremy Barnes, Erik Velldal, Lilja Øvrelid, Stephan Oepen

We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training.

Language Modelling

If you've got it, flaunt it: Making the most of fine-grained sentiment annotations

no code implementations • EACL 2021 • Jeremy Barnes, Lilja Øvrelid, Erik Velldal

Fine-grained sentiment analysis attempts to extract sentiment holders, targets and polar expressions and resolve the relationship between them, but progress has been hampered by the difficulty of annotation.

General Classification Sentiment Analysis

A Systematic Comparison of Architectures for Document-Level Sentiment Classification

1 code implementation • 19 Feb 2020 • Jeremy Barnes, Vinit Ravishankar, Lilja Øvrelid, Erik Velldal

Documents are composed of smaller pieces - paragraphs, sentences, and tokens - that have complex relationships between one another.

Classification Document Classification +5

A Fine-Grained Sentiment Dataset for Norwegian

1 code implementation • LREC 2020 • Lilja Øvrelid, Petter Mæhlum, Jeremy Barnes, Erik Velldal

We introduce NoReC_fine, a dataset for fine-grained sentiment analysis in Norwegian, annotated with respect to polar expressions, targets and holders of opinion.

Sentiment Analysis

NorNE: Annotating Named Entities for Norwegian

1 code implementation • LREC 2020 • Fredrik Jørgensen, Tobias Aasmoe, Anne-Stine Ruud Husevåg, Lilja Øvrelid, Erik Velldal

This paper presents NorNE, a manually annotated corpus of named entities which extends the annotation of the existing Norwegian Dependency Treebank.

Multilingual Probing of Deep Pre-Trained Contextual Encoders

no code implementations • WS 2019 • Vinit Ravishankar, Memduh Gökırmak, Lilja Øvrelid, Erik Velldal

Encoders that generate representations based on context have, in recent years, benefited from adaptations that allow for pre-training on large text corpora.

Sentence

Measuring Diachronic Evolution of Evaluative Adjectives with Word Embeddings: the Case for English, Norwegian, and Russian

no code implementations • WS 2019 • Julia Rodina, Daria Bakshandaeva, Vadim Fomin, Andrey Kutuzov, Samia Touileb, Erik Velldal

We measure the intensity of diachronic semantic shifts in adjectives in English, Norwegian and Russian across 5 decades.

Word Embeddings

One-to-X analogical reasoning on word embeddings: a case for diachronic armed conflict prediction from news texts

1 code implementation • WS 2019 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

We extend the well-known word analogy task to a one-to-X formulation, including one-to-none cases, when no correct answer exists.

Word Embeddings

Improving Sentiment Analysis with Multi-task Learning of Negation

1 code implementation • 18 Jun 2019 • Jeremy Barnes, Erik Velldal, Lilja Øvrelid

Sentiment analysis is directly affected by compositional phenomena in language that act on the prior polarity of the words and phrases found in the text.

Multi-Task Learning Negation +1

Sentiment analysis is not solved! Assessing and probing sentiment classification

1 code implementation • WS 2019 • Jeremy Barnes, Lilja Øvrelid, Erik Velldal

Finally, we provide a case study that demonstrates the usefulness of the dataset to probe the performance of a given sentiment classifier with respect to linguistic phenomena.

Classification General Classification +3

Probing Multilingual Sentence Representations With X-Probe

no code implementations • WS 2019 • Vinit Ravishankar, Lilja Øvrelid, Erik Velldal

This paper extends the task of probing sentence representations for linguistic insight in a multilingual domain.

Natural Language Inference Sentence

Transfer and Multi-Task Learning for Noun--Noun Compound Interpretation

no code implementations • EMNLP 2018 • Murhaf Fares, Stephan Oepen, Erik Velldal

In this paper, we empirically evaluate the utility of transfer and multi-task learning on a challenging semantic classification task: semantic interpretation of noun--noun compounds.

General Classification Information Retrieval +3

Transfer and Multi-Task Learning for Noun-Noun Compound Interpretation

1 code implementation • 18 Sep 2018 • Murhaf Fares, Stephan Oepen, Erik Velldal

In this paper, we empirically evaluate the utility of transfer and multi-task learning on a challenging semantic classification task: semantic interpretation of noun--noun compounds.

Classification General Classification +1

Diachronic word embeddings and semantic shifts: a survey

no code implementations • COLING 2018 • Andrey Kutuzov, Lilja Øvrelid, Terrence Szymanski, Erik Velldal

Recent years have witnessed a surge of publications aimed at tracing temporal changes in lexical semantics using distributional methods, particularly prediction-based word embedding models.

Diachronic Word Embeddings Word Embeddings

NoReC: The Norwegian Review Corpus

1 code implementation • LREC 2018 • Erik Velldal, Lilja Øvrelid, Eivind Alexander Bergem, Cathrine Stadsnes, Samia Touileb, Fredrik Jørgensen

As resources for sentiment analysis have so far been unavailable for Norwegian, NoReC represents a highly valuable and sought-after addition to Norwegian language technology.

Opinion Mining Sentiment Analysis

Tracing armed conflicts with diachronic word embedding models

no code implementations • WS 2017 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

Recent studies have shown that word embedding models can be used to trace time-related (diachronic) semantic shifts in particular words.

Word Embeddings

Temporal dynamics of semantic relations in word embeddings: an application to predicting armed conflict participants

no code implementations • EMNLP 2017 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

This paper deals with using word embedding models to trace the temporal dynamics of semantic relations between pairs of words.

Word Embeddings

Wordnet extension via word embeddings: Experiments on the Norwegian Wordnet

no code implementations • WS 2017 • Heidi Sand, Erik Velldal, Lilja Øvrelid

POS Word Embeddings

Optimizing a PoS Tagset for Norwegian Dependency Parsing

no code implementations • WS 2017 • Petter Hohle, Lilja Øvrelid, Erik Velldal

Dependency Parsing Feature Engineering +5

Joint UD Parsing of Norwegian Bokmål and Nynorsk

1 code implementation • WS 2017 • Erik Velldal, Lilja Øvrelid, Petter Hohle

Language Identification Machine Translation

Word vectors, reuse, and replicability: Towards a community repository of large-text resources

no code implementations • WS 2017 • Murhaf Fares, Andrey Kutuzov, Stephan Oepen, Erik Velldal

Semantic Textual Similarity Word Embeddings

An open-source tool for negation detection: a maximum-margin approach

1 code implementation • WS 2017 • Martine Enger, Erik Velldal, Lilja Øvrelid

This paper presents an open-source toolkit for negation detection.

General Classification Negation +2

Representation and Interchange of Linguistic Annotation. An In-Depth, Side-by-Side Comparison of Three Designs

no code implementations • WS 2017 • Richard Eckart de Castilho, Nancy Ide, Emanuele Lapponi, Stephan Oepen, Keith Suderman, Erik Velldal, Marc Verhagen

We expect that a more in-depth understanding of these choices across designs may lead to increased harmonization, or at least to more informed design of future representations.

Redefining part-of-speech classes with distributional semantic models

no code implementations • CONLL 2016 • Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries.

POS TAG +1

OPT: Oslo--Potsdam--Teesside. Pipelining Rules, Rankers, and Classifier Ensembles for Shallow Discourse Parsing

no code implementations • CONLL 2016 • Stephan Oepen, Jonathon Read, Tatjana Scheffler, Uladzimir Sidarenka, Manfred Stede, Erik Velldal, Lilja Øvrelid

Discourse Parsing Machine Translation +2

Threat detection in online discussions

no code implementations • WS 2016 • Aksel Wester, Lilja Øvrelid, Erik Velldal, Hugo Lewi Hammer

Word Sense Disambiguation

A Corpus of Clinical Practice Guidelines Annotated with the Importance of Recommendations

no code implementations • LREC 2016 • Jonathon Read, Erik Velldal, Marc Cavazza, Gersende Georg

In this paper we present the Corpus of REcommendation STrength (CREST), a collection of HTML-formatted clinical guidelines annotated with the location of recommendations.

Improving cross-domain dependency parsing with dependency-derived clusters

no code implementations • WS 2015 • Jostein Lien, Erik Velldal, Lilja Øvrelid

Dependency Parsing Domain Adaptation

Predicting Party Affiliations from European Parliament Debates

no code implementations • WS 2014 • Bjørn Høyland, Jean-François Godbout, Emanuele Lapponi, Erik Velldal

Off-Road LAF: Encoding and Processing Annotations in NLP Workflows

no code implementations • LREC 2014 • Emanuele Lapponi, Erik Velldal, Stephan Oepen, Rune Lain Knudsen

The Linguistic Annotation Framework (LAF) provides an abstract data model for specifying interchange representations to ensure interoperability among different annotation formats.

Part-Of-Speech Tagging