Search Results for author: Khalid Alnajjar

Found 40 papers, 14 papers with code

Processing M.A. Castrén’s Materials: Multilingual Historical Typed and Handwritten Manuscripts

no code implementations NLP4DH (ICON) 2021 Niko Partanen, Jack Rueter, Khalid Alnajjar, Mika Hämäläinen

The study forms a technical report of various tasks that have been performed on the materials collected and published by Finnish ethnographer and linguist, Matthias Alexander Castrén (1813–1852).

Linguistic change and historical periodization of Old Literary Finnish

no code implementations ACL (LChange) 2021 Niko Partanen, Khalid Alnajjar, Mika Hämäläinen, Jack Rueter

In this study, we have normalized and lemmatized an Old Literary Finnish corpus using a lemmatization model trained on texts from Agricola.

Lemmatization Word Embeddings

Sentiment Analysis Using Aligned Word Embeddings for Uralic Languages

no code implementations 24 May 2023 Khalid Alnajjar, Mika Hämäläinen, Jack Rueter

Furthermore, we align these word embeddings and present a novel neural network model that is trained on English data to conduct sentiment analysis and then applied on endangered language data through the aligned word embeddings.

Sentiment Analysis Word Embeddings

Emotion Conditioned Creative Dialog Generation

no code implementations 6 Dec 2022 Khalid Alnajjar, Mika Hämäläinen

We present a DialGPT-based model for generating creative dialog responses that are conditioned on one of the following emotions: anger, disgust, fear, happiness, pain, sadness and surprise.

Sentence

Video Games as a Corpus: Sentiment Analysis using Fallout New Vegas Dialog

no code implementations 5 Dec 2022 Mika Hämäläinen, Khalid Alnajjar, Thierry Poibeau

We conduct experiments on multilingual, multilabel sentiment analysis on the extracted data set using multilingual BERT, XLMRoBERTa and language specific BERT models.

Sentiment Analysis

When to Laugh and How Hard? A Multimodal Approach to Detecting Humor and its Intensity

no code implementations COLING 2022 Khalid Alnajjar, Mika Hämäläinen, Jörg Tiedemann, Jorma Laaksonen, Mikko Kurimo

Our results show that the model is capable of correctly detecting whether an utterance is humorous 78% of the time and how long the audience's laughter reaction should last with a mean absolute error of 600 milliseconds.

Multilingual Persuasion Detection: Video Games as an Invaluable Data Source for NLP

1 code implementation 10 Jul 2022 Teemu Pöyhönen, Mika Hämäläinen, Khalid Alnajjar

Role-playing games (RPGs) have a considerable amount of text in video game dialogues.

Harnessing Multilingual Resources to Question Answering in Arabic

no code implementations 16 May 2022 Khalid Alnajjar, Mika Hämäläinen

Our approach consists of two steps: first, we train a BERT model to predict a set of possible answers in a passage.

Question Answering

Processing M.A. Castrén's Materials: Multilingual Typed and Handwritten Manuscripts

no code implementations 28 Dec 2021 Niko Partanen, Jack Rueter, Mika Hämäläinen, Khalid Alnajjar

The study forms a technical report of various tasks that have been performed on the materials collected and published by Finnish ethnographer and linguist, Matthias Alexander Castrén (1813–1852).

TFW2V: An Enhanced Document Similarity Method for the Morphologically Rich Finnish Language

1 code implementation NLP4DH (ICON) 2021 Quan Duong, Mika Hämäläinen, Khalid Alnajjar

Measuring the semantic similarity of different texts has many important applications in Digital Humanities research such as information retrieval, document clustering and text summarization.

Benchmarking Clustering +6

Finnish Dialect Identification: The Effect of Audio and Text

1 code implementation EMNLP 2021 Mika Hämäläinen, Khalid Alnajjar, Niko Partanen, Jack Rueter

Finnish is a language with multiple dialects that not only differ from each other in terms of accent (pronunciation) but also in terms of morphological forms and lexical choice.

Dialect Identification

The Current State of Finnish NLP

no code implementations ACL (IWCLUL) 2021 Mika Hämäläinen, Khalid Alnajjar

There are a lot of tools and resources available for processing Finnish.

Human Evaluation of Creative NLG Systems: An Interdisciplinary Survey on Recent Papers

no code implementations ACL (GEM) 2021 Mika Hämäläinen, Khalid Alnajjar

We survey human evaluation in papers presenting work on creative natural language generation that have been published in INLG 2020 and ICCC 2020.

Text Generation

Lemmatization of Historical Old Literary Finnish Texts in Modern Orthography

1 code implementation JEP/TALN/RECITAL 2021 Mika Hämäläinen, Niko Partanen, Khalid Alnajjar

Texts written in Old Literary Finnish represent the first literary work ever written in Finnish starting from the 16th century.

Lemmatization

The Great Misalignment Problem in Human Evaluation of NLP Methods

no code implementations EACL (HumEval) 2021 Mika Hämäläinen, Khalid Alnajjar

These results highlight that the Great Misalignment Problem is a major one and it affects the validity and reproducibility of results obtained by a human evaluation.

When Word Embeddings Become Endangered

no code implementations 24 Mar 2021 Khalid Alnajjar

In this paper, we present a method for constructing word embeddings for endangered languages using existing word embeddings of different resource-rich languages and the translation dictionaries of resource-poor languages.

Cross-Lingual Word Embeddings Sentiment Analysis +2

Corpona – The Pythonic Way of Processing Corpora

1 code implementation 18 Mar 2021 Khalid Alnajjar, Mika Hämäläinen

Every NLP researcher has to work with different XML or JSON encoded files.

Normalization of Different Swedish Dialects Spoken in Finland

1 code implementation 9 Dec 2020 Mika Hämäläinen, Niko Partanen, Khalid Alnajjar

Our study presents a dialect normalization method for different Finland Swedish dialects covering six regions.

Ve'rdd. Narrowing the Gap between Paper Dictionaries, Low-Resource NLP and Community Involvement

1 code implementation COLING 2020 Khalid Alnajjar, Mika Hämäläinen, Jack Rueter, Niko Partanen

We present an open-source online dictionary editing system, Ve'rdd, that offers a chance to re-evaluate and edit grassroots dictionaries that have been exposed to multiple amateur editors.

Automated Prediction of Medieval Arabic Diacritics

1 code implementation 11 Oct 2020 Khalid Alnajjar, Mika Hämäläinen, Niko Partanen, Jack Rueter

This study uses a character level neural machine translation approach trained on a long short-term memory-based bi-directional recurrent neural network architecture for diacritization of Medieval Arabic.

Machine Translation Translation

Generating Modern Poetry Automatically in Finnish

no code implementations IJCNLP 2019 Mika Hämäläinen, Khalid Alnajjar

We present a novel approach for generating poetry automatically for the morphologically rich Finnish language by using a genetic algorithm.

Dialect Text Normalization to Normative Standard Finnish

1 code implementation WS 2019 Niko Partanen, Mika Hämäläinen, Khalid Alnajjar

We compare different LSTMs and transformer models in terms of their effectiveness in normalizing dialectal Finnish into the normative standard Finnish.

Let's FACE it. Finnish Poetry Generation with Aesthetics and Framing

1 code implementation WS 2019 Mika Hämäläinen, Khalid Alnajjar

We present a creative poem generator for the morphologically rich Finnish language.

Creative Contextual Dialog Adaptation in an Open World RPG

1 code implementation The 14th International Conference on the Foundations of Digital Games 2019 Mika Hämäläinen, Khalid Alnajjar

Role-playing games typically rely on hand-written dialog that has no flexibility in adapting to the game state, such as the level of the player.

Word Embeddings

A Creative Dialog Generator for Fallout 4

1 code implementation The 14th International Conference on the Foundations of Digital Games 2019 Khalid Alnajjar, Mika Hämäläinen

This software demonstration describes a mod for Fallout 4 that will adapt in-game dialog to the context of the current state of the game.

Modelling the Socialization of Creative Agents in a Master-Apprentice Setting: The Case of Movie Title Puns

no code implementations 10 Jul 2019 Mika Hämäläinen, Khalid Alnajjar

This paper presents work on modelling the social psychological aspect of socialization in the case of a computationally creative master-apprentice system.

NMT
