Search Results for author: Samuel Rönnqvist

Found 14 papers, 5 papers with code

Multilingual and Zero-Shot is Closing in on Monolingual Web Register Classification

no code implementations • NoDaLiDa 2021 • Samuel Rönnqvist, Valtteri Skantsi, Miika Oinonen, Veronika Laippala

This article studies register classification of documents from the unrestricted web, such as news articles or opinion blogs, in a multilingual setting, exploring both the benefit of training on multiple languages and the capabilities for zero-shot cross-lingual transfer.

XLM-R Zero-Shot Cross-Lingual Transfer

Towards better structured and less noisy Web data: Oscar with Register annotations

no code implementations • COLING (WNUT) 2022 • Veronika Laippala, Anna Salmela, Samuel Rönnqvist, Alham Fikri Aji, Li-Hsin Chang, Asma Dhifallah, Larissa Goulart, Henna Kortelainen, Marc Pàmies, Deise Prina Dutra, Valtteri Skantsi, Lintang Sutawika, Sampo Pyysalo

Web-crawled datasets are known to be noisy, as they feature a wide range of language use covering both user-generated and professionally edited content as well as noise originating from the crawling process.

Explaining Classes through Stable Word Attributions

1 code implementation • Findings (ACL) 2022 • Samuel Rönnqvist, Aki-Juhani Kyröläinen, Amanda Myntti, Filip Ginter, Veronika Laippala

Input saliency methods have recently become a popular tool for explaining predictions of deep learning models in NLP.

Text Classification +1

Explaining Classes through Word Attribution

no code implementations • 31 Aug 2021 • Samuel Rönnqvist, Amanda Myntti, Aki-Juhani Kyröläinen, Sampo Pyysalo, Veronika Laippala, Filip Ginter

In this work, we propose a method for explaining classes using deep learning models and the Integrated Gradients feature attribution technique by aggregating explanations of individual examples in text classification to general descriptions of the classes.

Genre classification text-classification +1

Beyond the English Web: Zero-Shot Cross-Lingual and Lightweight Monolingual Classification of Registers

1 code implementation • EACL 2021 • Liina Repo, Valtteri Skantsi, Samuel Rönnqvist, Saara Hellström, Miika Oinonen, Anna Salmela, Douglas Biber, Jesse Egbert, Sampo Pyysalo, Veronika Laippala

We explore cross-lingual transfer of register classification for web documents.

Classification General Classification +1

Morphological Tagging and Lemmatization of Albanian: A Manually Annotated Corpus and Neural Models

1 code implementation • 2 Dec 2019 • Nelda Kote, Marenglen Biba, Jenna Kanerva, Samuel Rönnqvist, Filip Ginter

In this paper, we present the first publicly available part-of-speech and morphologically tagged corpus for the Albanian language, as well as a neural morphological tagger and lemmatizer trained on it.

Lemmatization Morphological Tagging +1

Is Multilingual BERT Fluent in Language Generation?

1 code implementation • WS 2019 • Samuel Rönnqvist, Jenna Kanerva, Tapio Salakoski, Filip Ginter

The multilingual BERT model is trained on 104 languages and meant to serve as a universal language model and tool for encoding sentences.

Language Modelling Sentence +1

Template-free Data-to-Text Generation of Finnish Sports News

1 code implementation • WS (NoDaLiDa) 2019 • Jenna Kanerva, Samuel Rönnqvist, Riina Kekki, Tapio Salakoski, Filip Ginter

News articles such as sports game reports are often thought to closely follow the underlying game statistics, but in practice they contain a notable amount of background knowledge, interpretation, insight into the game, and quotes that are not present in the official statistics.

Data-to-Text Generation News Generation

A Recurrent Neural Model with Attention for the Recognition of Chinese Implicit Discourse Relations

no code implementations • ACL 2017 • Samuel Rönnqvist, Niko Schenk, Christian Chiarcos

We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches.

Bank distress in the news: Describing events through deep learning

no code implementations • 17 Mar 2016 • Samuel Rönnqvist, Peter Sarlin

While many models are purposed for detecting the occurrence of significant events in financial systems, the task of providing qualitative detail on the developments is not usually as well automated.

Descriptive

Detect & Describe: Deep learning of bank stress in the news

no code implementations • 25 Jul 2015 • Samuel Rönnqvist, Peter Sarlin

We model bank distress with data on 243 events and 6.6M news articles for 101 large European banks.

Exploratory topic modeling with distributional semantics

no code implementations • 16 Jul 2015 • Samuel Rönnqvist

As we continue to collect and store textual data in a multitude of domains, we are regularly confronted with material whose largely unknown thematic structure we want to uncover.

Interactive Visual Exploration of Topic Models using Graphs

no code implementations • 19 Sep 2014 • Samuel Rönnqvist, Xiaolu Wang, Peter Sarlin

Probabilistic topic modeling is a popular and powerful family of tools for uncovering thematic structure in large sets of unstructured text documents.

Descriptive Information Retrieval +2

Cluster coloring of the Self-Organizing Map: An information visualization perspective

no code implementations • 17 Jun 2013 • Peter Sarlin, Samuel Rönnqvist

From the viewpoint of information visualization, this paper provides a general, yet simple, solution to projection-based coloring of the SOM that reveals structures.
