no code implementations • NAACL (maiworkshop) 2021 • Khalid Alnajjar, Mika Hämäläinen
We construct the first ever multimodal sarcasm dataset for Spanish.
no code implementations • NLP4DH (ICON) 2021 • Niko Partanen, Jack Rueter, Khalid Alnajjar, Mika Hämäläinen
The study forms a technical report of various tasks that have been performed on the materials collected and published by Finnish ethnographer and linguist, Matthias Alexander Castrén (1813–1852).
no code implementations • ComputEL (ACL) 2022 • Khalid Alnajjar, Mika Hämäläinen, Niko Tapio Partanen, Jack Rueter
Many endangered Uralic languages have multilingual machine readable dictionaries saved in an XML format.
no code implementations • ACL (LChange) 2021 • Niko Partanen, Khalid Alnajjar, Mika Hämäläinen, Jack Rueter
In this study, we have normalized and lemmatized an Old Literary Finnish corpus using a lemmatization model trained on texts from Agricola.
no code implementations • 24 May 2023 • Khalid Alnajjar, Mika Hämäläinen, Jack Rueter
Furthermore, we align these word embeddings and present a novel neural network model that is trained on English data to conduct sentiment analysis and then applied on endangered language data through the aligned word embeddings.
no code implementations • 15 Dec 2022 • Khalid Alnajjar, Mika Hämäläinen, Shuo Zhang
We present the first openly available multimodal metaphor annotated corpus.
no code implementations • 6 Dec 2022 • Khalid Alnajjar, Mika Hämäläinen
We present a DialoGPT-based model for generating creative dialog responses that are conditioned on one of the following emotions: anger, disgust, fear, happiness, pain, sadness and surprise.
no code implementations • 6 Dec 2022 • Mika Hämäläinen, Khalid Alnajjar, Thierry Poibeau
We present a novel neural model for modern poetry generation in French.
no code implementations • 5 Dec 2022 • Maximilian Koppatz, Khalid Alnajjar, Mika Hämäläinen, Thierry Poibeau
We present a novel approach to generating news headlines in Finnish for a given news story.
no code implementations • 5 Dec 2022 • Mika Hämäläinen, Khalid Alnajjar, Thierry Poibeau
We conduct experiments on multilingual, multilabel sentiment analysis on the extracted data set using multilingual BERT, XLM-RoBERTa and language-specific BERT models.
no code implementations • COLING 2022 • Khalid Alnajjar, Mika Hämäläinen, Jörg Tiedemann, Jorma Laaksonen, Mikko Kurimo
Our results show that the model is capable of correctly detecting whether an utterance is humorous 78% of the time and how long the audience's laughter reaction should last with a mean absolute error of 600 milliseconds.
1 code implementation • 10 Jul 2022 • Teemu Pöyhönen, Mika Hämäläinen, Khalid Alnajjar
Role-playing games (RPGs) have a considerable amount of text in video game dialogues.
no code implementations • 16 May 2022 • Khalid Alnajjar, Mika Hämäläinen
Our approach consists of two steps. First, we train a BERT model to predict a set of possible answers in a passage.
no code implementations • 28 Dec 2021 • Niko Partanen, Jack Rueter, Mika Hämäläinen, Khalid Alnajjar
The study forms a technical report of various tasks that have been performed on the materials collected and published by Finnish ethnographer and linguist, Matthias Alexander Castrén (1813–1852).
1 code implementation • NLP4DH (ICON) 2021 • Quan Duong, Mika Hämäläinen, Khalid Alnajjar
Measuring the semantic similarity of different texts has many important applications in Digital Humanities research such as information retrieval, document clustering and text summarization.
no code implementations • WNUT (ACL) 2021 • Mika Hämäläinen, Pattama Patpong, Khalid Alnajjar, Niko Partanen, Jack Rueter
We present the first openly available corpus for detecting depression in Thai.
1 code implementation • EMNLP 2021 • Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
Finnish is a language with multiple dialects that not only differ from each other in terms of accent (pronunciation) but also in terms of morphological forms and lexical choice.
no code implementations • ACL (IWCLUL) 2021 • Mika Hämäläinen, Khalid Alnajjar
There are a lot of tools and resources available for processing Finnish.
no code implementations • 17 Sep 2021 • Khalid Alnajjar, Mika Hämäläinen
Automated news generation has become a major interest for news agencies in the past.
no code implementations • 21 Aug 2021 • Mika Hämäläinen, Khalid Alnajjar, Niko Partanen
Based on our experiments, it is better to train a model with domain-specific data than to use a pretrained model.
no code implementations • ACL (GEM) 2021 • Mika Hämäläinen, Khalid Alnajjar
We survey human evaluation in papers presenting work on creative natural language generation that have been published in INLG 2020 and ICCC 2020.
1 code implementation • JEP/TALN/RECITAL 2021 • Mika Hämäläinen, Niko Partanen, Khalid Alnajjar
Texts written in Old Literary Finnish represent the first literary work ever written in Finnish starting from the 16th century.
no code implementations • NAACL (NLP4IF) 2021 • Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter
However, a model fine-tuned on Multilingual BERT reaches the best factual label accuracy of 97.2%.
1 code implementation • NoDaLiDa 2021 • Mika Hämäläinen, Niko Partanen, Jack Rueter, Khalid Alnajjar
We train neural models for morphological analysis, generation and lemmatization for morphologically rich languages.
no code implementations • 12 May 2021 • Khalid Alnajjar, Mika Hämäläinen
We construct the first ever multimodal sarcasm dataset for Spanish.
no code implementations • EACL (HumEval) 2021 • Mika Hämäläinen, Khalid Alnajjar
These results highlight that the Great Misalignment Problem is a major one and it affects the validity and reproducibility of results obtained by a human evaluation.
no code implementations • 24 Mar 2021 • Khalid Alnajjar
In this paper, we present a method for constructing word embeddings for endangered languages using existing word embeddings of different resource-rich languages and the translation dictionaries of resource-poor languages.
1 code implementation • 18 Mar 2021 • Khalid Alnajjar, Mika Hämäläinen
Every NLP researcher has to work with different XML or JSON encoded files.
1 code implementation • 9 Dec 2020 • Mika Hämäläinen, Niko Partanen, Khalid Alnajjar
Our study presents a dialect normalization method for different Finland Swedish dialects covering six regions.
1 code implementation • COLING 2020 • Khalid Alnajjar, Mika Hämäläinen, Jack Rueter, Niko Partanen
We present an open-source online dictionary editing system, Ve'rdd, that offers a chance to re-evaluate and edit grassroots dictionaries that have been exposed to multiple amateur editors.
1 code implementation • 11 Oct 2020 • Khalid Alnajjar, Mika Hämäläinen, Niko Partanen, Jack Rueter
This study uses a character level neural machine translation approach trained on a long short-term memory-based bi-directional recurrent neural network architecture for diacritization of Medieval Arabic.
1 code implementation • 6 Sep 2020 • Mika Hämäläinen, Niko Partanen, Khalid Alnajjar, Jack Rueter, Thierry Poibeau
The models are tested with over 20 different dialects.
no code implementations • IJCNLP 2019 • Mika Hämäläinen, Khalid Alnajjar
We present a novel approach for generating poetry automatically for the morphologically rich Finnish language by using a genetic algorithm.
1 code implementation • WS 2019 • Niko Partanen, Mika Hämäläinen, Khalid Alnajjar
We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish.
1 code implementation • WS 2019 • Mika Hämäläinen, Khalid Alnajjar
We present a creative poem generator for the morphologically rich Finnish language.
1 code implementation • The 14th International Conference on the Foundations of Digital Games 2019 • Mika Hämäläinen, Khalid Alnajjar
Role-playing games typically rely on hand-written dialog that has no flexibility in adapting to the game state such as the level of the player.
1 code implementation • The 14th International Conference on the Foundations of Digital Games 2019 • Khalid Alnajjar, Mika Hämäläinen
This software demonstration describes a mod for Fallout 4 that will adapt in-game dialog to the context of the current state of the game.
no code implementations • 10 Jul 2019 • Mika Hämäläinen, Khalid Alnajjar
This paper presents work on modelling the social psychological aspect of socialization in the case of a computationally creative master-apprentice system.
no code implementations • WS 2018 • Khalid Alnajjar, Mika Hämäläinen
Satire has played a role in indirectly expressing critique towards an authority or a person from time immemorial.