no code implementations • EAMT 2022 • Raúl Vázquez, Michele Boggia, Alessandro Raganato, Niki A. Loppi, Stig-Arne Grönroos, Jörg Tiedemann
We describe the enhancement of a multilingual NMT toolkit developed as part of the FoTran project.
no code implementations • WAT 2022 • Shantipriya Parida, Subhadarshi Panda, Stig-Arne Grönroos, Mark Granroth-Wilding, Mika Koistinen
This paper provides the system description of “Silo NLP’s” submission to the Workshop on Asian Translation (WAT2022).
no code implementations • NAACL (SIGMORPHON) 2022 • Aku Rouhe, Stig-Arne Grönroos, Sami Virpioja, Mathias Creutz, Mikko Kurimo
Our approach is to pre-segment the input data for a neural sequence-to-sequence model with an unsupervised method.
Ranked #1 on Morpheme Segmentation on UniMorph 4.0 (f1 macro avg (subtask 2) metric)
no code implementations • WMT (EMNLP) 2020 • Yves Scherrer, Stig-Arne Grönroos, Sami Virpioja
This paper describes the joint participation of the University of Helsinki and Aalto University in two shared tasks of WMT 2020: news translation between Inuktitut and English, and low-resource translation between German and Upper Sorbian.
1 code implementation • 12 Mar 2024 • Timothee Mickus, Stig-Arne Grönroos, Joseph Attieh, Michele Boggia, Ona de Gibert, Shaoxiong Ji, Niki Andreas Loppi, Alessandro Raganato, Raúl Vázquez, Jörg Tiedemann
NLP in the age of monolithic large language models is approaching its limits in terms of size and information that can be handled.
no code implementations • 5 Feb 2024 • Timothee Mickus, Stig-Arne Grönroos, Joseph Attieh
Whether embedding spaces use all their dimensions equally, i.e., whether they are isotropic, has been a recent subject of discussion.
no code implementations • 4 Dec 2022 • Jörg Tiedemann, Mikko Aulamo, Daria Bakshandaeva, Michele Boggia, Stig-Arne Grönroos, Tommi Nieminen, Alessandro Raganato, Yves Scherrer, Raúl Vázquez, Sami Virpioja
This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows.
1 code implementation • 2 Aug 2022 • Shantipriya Parida, Subhadarshi Panda, Stig-Arne Grönroos, Mark Granroth-Wilding, Mika Koistinen
This paper provides the system description of "Silo NLP's" submission to the Workshop on Asian Translation (WAT2022).
1 code implementation • 8 Apr 2020 • Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
There are several approaches for improving neural machine translation for low-resource languages: monolingual data can be exploited via pretraining or data augmentation; parallel corpora on related language pairs can be used via parameter sharing or transfer learning in multilingual models; subword segmentation and regularization techniques can be applied to ensure high coverage of the vocabulary.
no code implementations • 14 Mar 2020 • Abhilash Jain, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo
Transformers have recently taken center stage in language modeling after LSTMs were considered the dominant model architecture for a long time.
1 code implementation • LREC 2020 • Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
Using English, Finnish, North Sami, and Turkish data sets, we show that this approach is able to find better solutions to the optimization problem defined by the Morfessor Baseline model than its original recursive training algorithm.
no code implementations • 28 Nov 2019 • Umut Sulubacak, Ozan Caglayan, Stig-Arne Grönroos, Aku Rouhe, Desmond Elliott, Lucia Specia, Jörg Tiedemann
Multimodal machine translation involves drawing information from more than one modality, based on the assumption that the additional modalities will contain useful alternative views of the input data.
Ranked #4 on Multimodal Machine Translation on Multi30K
no code implementations • IWSLT (EMNLP) 2018 • Umut Sulubacak, Jörg Tiedemann, Aku Rouhe, Stig-Arne Grönroos, Mikko Kurimo
In this paper, we also describe the experiments leading up to our final systems.
no code implementations • WS 2018 • Stig-Arne Grönroos, Sami Virpioja, Mikko Kurimo
This article describes the Aalto University entry to the WMT18 News Translation Shared Task.
no code implementations • WS 2018 • Stig-Arne Grönroos, Benoit Huet, Mikko Kurimo, Jorma Laaksonen, Bernard Merialdo, Phu Pham, Mats Sjöberg, Umut Sulubacak, Jörg Tiedemann, Raphael Troncy, Raúl Vázquez
Our experiments show that the effect of the visual features in our system is small.