Search Results for author: Elaine Zosa

Found 15 papers, 3 papers with code

Multilingual and Multimodal Topic Modelling with Pretrained Embeddings

1 code implementation COLING 2022 Elaine Zosa, Lidia Pivovarova

This paper presents M3L-Contrast -- a novel multimodal multilingual (M3L) neural topic model for comparable data that maps texts from multiple languages and images into a shared topic space.

Capturing Evolution in Word Usage: Just Add More Clusters?

no code implementations 18 Jan 2020 Matej Martinc, Syrielle Montariol, Elaine Zosa, Lidia Pivovarova

The way the words are used evolves through time, mirroring cultural or technological evolution of society.

Change Detection

Word Clustering for Historical Newspapers Analysis

no code implementations RANLP 2019 Lidia Pivovarova, Elaine Zosa, Jani Marjanen

This paper is a part of a collaboration between computer scientists and historians aimed at development of novel tools and methods to improve analysis of historical newspapers.

Clustering

Multilingual Dynamic Topic Model

no code implementations RANLP 2019 Elaine Zosa, Mark Granroth-Wilding

Dynamic topic models (DTMs) capture the evolution of topics and trends in time series data. Current DTMs are applicable only to monolingual datasets.

Time Series, Time Series Analysis

Topic modelling discourse dynamics in historical newspapers

no code implementations 20 Nov 2020 Jani Marjanen, Elaine Zosa, Simon Hengchen, Lidia Pivovarova, Mikko Tolonen

This paper addresses methodological issues in diachronic data analysis for historical research.

Topic Models

Interesting cross-border news discovery using cross-lingual article linking and document similarity

no code implementations EACL (Hackashop) 2021 Boshko Koloski, Elaine Zosa, Timen Stepišnik-Perdih, Blaž Škrlj, Tarmo Paju, Senja Pollak

Team name: team-8. Embeddia tool: Cross-Lingual Document Retrieval (Zosa et al.). Dataset: Estonian and Latvian news datasets. Abstract: Contemporary news media face increasing amounts of available data that can be of use when prioritizing, selecting and discovering new news.

Retrieval

Grounded and Well-rounded: A Methodological Approach to the Study of Cross-modal and Cross-lingual Grounding

no code implementations 18 Oct 2023 Timothee Mickus, Elaine Zosa, Denis Paperno

Grounding has been argued to be a crucial component towards the development of more complete and truly semantically competent artificial intelligence systems.

SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes

no code implementations 12 Mar 2024 Timothee Mickus, Elaine Zosa, Raúl Vázquez, Teemu Vahtola, Jörg Tiedemann, Vincent Segonne, Alessandro Raganato, Marianna Apidianaki

This paper presents the results of the SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate.

Machine Translation, Paraphrase Generation

Poro 34B and the Blessing of Multilinguality

no code implementations 2 Apr 2024 Risto Luukkonen, Jonathan Burdge, Elaine Zosa, Aarne Talman, Ville Komulainen, Väinö Hatanpää, Peter Sarlin, Sampo Pyysalo

The pretraining of state-of-the-art large language models now requires trillions of words of text, which is orders of magnitude more than available for the vast majority of languages.
