Search Results for author: Theodorus Fransen

Found 9 papers, 2 papers with code

Cross-lingual Sentence Embedding using Multi-Task Learning

no code implementations • EMNLP 2021 • Koustava Goswami, Sourav Dutta, Haytham Assem, Theodorus Fransen, John P. McCrae

We demonstrate the efficacy of an unsupervised as well as a weakly supervised variant of our framework on STS, BUCC and Tatoeba benchmark tasks.

Multi-Task Learning Semantic Similarity +6

MHE: Code-Mixed Corpora for Similar Language Identification

no code implementations • LREC 2022 • Priya Rani, John P. McCrae, Theodorus Fransen

This dataset is the first Magahi-Hindi-English code-mixed dataset for the similar language identification task.

Language Identification Sentence

MaCmS: Magahi Code-mixed Dataset for Sentiment Analysis

no code implementations • 7 Mar 2024 • Priya Rani, Gaurav Negi, Theodorus Fransen, John P. McCrae

The present paper introduces new sentiment data, MaCMS, for Magahi-Hindi-English (MHE) code-mixed language, where Magahi is a less-resourced minority language.

Sentiment Analysis

Weakly-supervised Deep Cognate Detection Framework for Low-Resourced Languages Using Morphological Knowledge of Closely-Related Languages

1 code implementation • 9 Nov 2023 • Koustava Goswami, Priya Rani, Theodorus Fransen, John P. McCrae

We train an encoder to gain morphological knowledge of a language and transfer the knowledge to perform unsupervised and weakly-supervised cognate detection tasks with and without the pivot language for the closely-related languages.

Information Retrieval named-entity-recognition +3

Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages

no code implementations • MTSummit 2021 • Atul Kr. Ojha, Chao-Hong Liu, Katharina Kann, John Ortega, Sheetal Shatam, Theodorus Fransen

The best system scores, measured in BLEU, were 36.0 for English–Irish, 34.6 for Irish–English, 24.2 for English–Marathi, and 31.3 for Marathi–English.

Machine Translation Translation

Unsupervised Deep Language and Dialect Identification for Short Texts

no code implementations • COLING 2020 • Koustava Goswami, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae

Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines.

Dialect Identification Sentence +1

ULD@NUIG at SemEval-2020 Task 9: Generative Morphemes with an Attention Model for Sentiment Analysis in Code-Mixed Text

no code implementations • SEMEVAL 2020 • Koustava Goswami, Priya Rani, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae

Code mixing is a common phenomenon in multilingual societies where people switch from one language to another for various reasons.

Sentiment Analysis

A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment

1 code implementation • LREC 2020 • Sina Ahmadi, John Philip McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, Thomas Troelsgård, Sussi Olsen, Simon Krek, Veronika Lipp, Tamás Váradi, László Simon, András Gyorffy, Carole Tiberius, Tanneke Schoonheim, Yifat Ben Moshe, Maya Rudich, Raya Abu Ahmad, Dorielle Lonke, Kira Kovalenko, Margit Langemets, Jelena Kallas, Oksana Dereza, Theodorus Fransen, David Cillessen, David Lindemann, Mikel Alonso, Ana Salgado, José Luis Sancho, Rafael-J. Ureña-Ruiz, Jordi Porta Zamorano, Kiril Simov, Petya Osenova, Zara Kancheva, Ivaylo Radev, Ranka Stanković, Andrej Perdih, Dejan Gabrovsek

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography.

A Comparative Study of Different State-of-the-Art Hate Speech Detection Methods in Hindi-English Code-Mixed Data

no code implementations • LREC 2020 • Priya Rani, Shardul Suryawanshi, Koustava Goswami, Bharathi Raja Chakravarthi, Theodorus Fransen, John Philip McCrae

Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities.

Hate Speech Detection
