Search Results for author: Theodorus Fransen

Found 9 papers, 2 papers with code

Cross-lingual Sentence Embedding using Multi-Task Learning

no code implementations EMNLP 2021 Koustava Goswami, Sourav Dutta, Haytham Assem, Theodorus Fransen, John P. McCrae

We demonstrate the efficacy of an unsupervised as well as a weakly supervised variant of our framework on STS, BUCC and Tatoeba benchmark tasks.

Multi-Task Learning Semantic Similarity +6

MaCmS: Magahi Code-mixed Dataset for Sentiment Analysis

no code implementations 7 Mar 2024 Priya Rani, Gaurav Negi, Theodorus Fransen, John P. McCrae

The present paper introduces a new sentiment dataset, MaCmS, for the Magahi-Hindi-English (MHE) code-mixed language, where Magahi is a less-resourced minority language.

Sentiment Analysis

Weakly-supervised Deep Cognate Detection Framework for Low-Resourced Languages Using Morphological Knowledge of Closely-Related Languages

1 code implementation 9 Nov 2023 Koustava Goswami, Priya Rani, Theodorus Fransen, John P. McCrae

We train an encoder to gain morphological knowledge of a language and transfer the knowledge to perform unsupervised and weakly-supervised cognate detection tasks with and without the pivot language for the closely-related languages.

Information Retrieval named-entity-recognition +3

Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages

no code implementations MTSummit 2021 Atul Kr. Ojha, Chao-Hong Liu, Katharina Kann, John Ortega, Sheetal Shatam, Theodorus Fransen

Maximum system performance, measured using BLEU, was as follows: 36.0 for English–Irish, 34.6 for Irish–English, 24.2 for English–Marathi, and 31.3 for Marathi–English.

Machine Translation Translation

Unsupervised Deep Language and Dialect Identification for Short Texts

no code implementations COLING 2020 Koustava Goswami, Rajdeep Sarkar, Bharathi Raja Chakravarthi, Theodorus Fransen, John P. McCrae

Automatic Language Identification (LI) or Dialect Identification (DI) of short texts of closely related languages or dialects is one of the primary steps in many natural language processing pipelines.

Dialect Identification Sentence +1
