Search Results for author: Sina Ahmadi

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography.

Challenges of Word Sense Alignment: Portuguese Language Resources

no code implementations • LREC 2020 • Ana Salgado, Sina Ahmadi, Alberto Simões, John Philip McCrae, Rute Costa

Word sense alignment involves searching for matching senses within dictionary entries of different lexical resources and linking them, which poses significant challenges.

A Corpus of the Sorani Kurdish Folkloric Lyrics

1 code implementation • LREC 2020 • Sina Ahmadi, Hossein Hassani, Kamaladdin Abedi

We believe that this corpus contributes to Kurdish language processing in several ways, such as compensation for the lack of a long history of written text by incorporating oral literature, presenting an unexplored realm in Kurdish language processing, and assisting the initiation of Kurdish computational folkloristics.

Attribute

Towards Finite-State Morphology of Kurdish

no code implementations • 21 May 2020 • Sina Ahmadi, Hossein Hassani

Morphological analysis is the study of the formation and structure of words.

Information Retrieval Machine Translation +4

Leveraging Multilingual News Websites for Building a Kurdish Parallel Corpus

1 code implementation • 4 Oct 2020 • Sina Ahmadi, Hossein Hassani, Daban Q. Jaff

We present a corpus containing 12,327 translation pairs in the two major dialects of Kurdish, Sorani and Kurmanji.

Translation Transliteration

Towards Machine Translation for the Kurdish Language

1 code implementation • LoResMT (AACL) 2020 • Sina Ahmadi, Mariam Masoud

Machine translation is the task of translating texts from one language to another using computers.

Machine Translation Translation

A Formal Description of Sorani Kurdish Morphology

no code implementations • 8 Sep 2021 • Sina Ahmadi

Sorani Kurdish, also known as Central Kurdish, has a complex morphology, particularly due to the patterns in which morphemes appear.

Morphological Analysis

Hunspell for Sorani Kurdish Spell Checking and Morphological Analysis

1 code implementation • 14 Sep 2021 • Sina Ahmadi

Spell checking and morphological analysis are two fundamental tasks in text and natural language processing and are addressed in the early stages of the development of language technology.

Morphological Analysis

Monolingual alignment of word senses and definitions in lexicographical resources

no code implementations • 6 Sep 2022 • Sina Ahmadi

This is a challenging task, especially due to differences in sense granularity, coverage and description in two resources.

PALI: A Language Identification Benchmark for Perso-Arabic Scripts

1 code implementation • 3 Apr 2023 • Sina Ahmadi, Milind Agarwal, Antonios Anastasopoulos

The Perso-Arabic scripts are a family of scripts that are widely adopted and used by various linguistic communities around the globe.

Language Identification

Approaches to Corpus Creation for Low-Resource Language Technology: the Case of Southern Kurdish and Laki

1 code implementation • 3 Apr 2023 • Sina Ahmadi, Zahra Azin, Sara Belelli, Antonios Anastasopoulos

One of the major challenges that under-represented and endangered language communities face in language technology is the lack or paucity of language data.

Language Identification

Transfer Learning for Low-Resource Sentiment Analysis

1 code implementation • 10 Apr 2023 • Razhan Hameed, Sina Ahmadi, Fatemeh Daneshfar

Sentiment analysis is the process of identifying and extracting subjective information from text.

Data Augmentation Sentiment Analysis +1

Script Normalization for Unconventional Writing of Under-Resourced Languages in Bilingual Communities

1 code implementation • 25 May 2023 • Sina Ahmadi, Antonios Anastasopoulos

The wide accessibility of social media has provided linguistically under-represented communities with an extraordinary opportunity to create content in their native languages.

Language Identification Machine Translation

CODET: A Benchmark for Contrastive Dialectal Evaluation of Machine Translation

no code implementations • 26 May 2023 • Md Mahfuz ibn Alam, Sina Ahmadi, Antonios Anastasopoulos

Neural machine translation (NMT) systems exhibit limited robustness in handling source-side linguistic variations.

Machine Translation NMT +1

A Morphologically-Aware Dictionary-based Data Augmentation Technique for Machine Translation of Under-Represented Languages

no code implementations • 2 Feb 2024 • Md Mahfuz ibn Alam, Sina Ahmadi, Antonios Anastasopoulos

In this paper, we propose strategies to synthesize parallel data relying on morpho-syntactic information and using bilingual lexicons along with a small amount of seed parallel data.

Data Augmentation Machine Translation

Language and Speech Technology for Central Kurdish Varieties

1 code implementation • 4 Mar 2024 • Sina Ahmadi, Daban Q. Jaff, Md Mahfuz ibn Alam, Antonios Anastasopoulos

Kurdish, an Indo-European language spoken by over 30 million speakers, is considered a dialect continuum and known for its diversity in language varieties.

Automatic Speech Recognition Language Identification +3

A Tokenization System for the Kurdish Language

1 code implementation • VarDial (COLING) 2020 • Sina Ahmadi

We demonstrate how the morphological complexity of the language along with the lack of a unified orthography can be efficiently addressed in tokenization.

Building a Corpus for the Zaza–Gorani Language Family

1 code implementation • VarDial (COLING) 2020 • Sina Ahmadi

The Zaza–Gorani language family is a linguistic subgroup of the Northwestern Iranian languages for which there is no significant corpus available.

KLPT – Kurdish Language Processing Toolkit

1 code implementation • EMNLP (NLPOSS) 2020 • Sina Ahmadi

Despite the recent advances in applying language-independent approaches to various natural language processing tasks thanks to artificial intelligence, some language-specific tools are still essential to process a language in a viable manner.

Lemmatization Transliteration

Monolingual Word Sense Alignment as a Classification Problem

no code implementations • EACL (GWC) 2021 • Sina Ahmadi, John P. McCrae

Words are defined based on their meanings in various ways in different resources.

Classification Relationship Detection +1

Cross-Lingual Link Discovery for Under-Resourced Languages

no code implementations • LREC 2022 • Michael Rosner, Sina Ahmadi, Elena-Simona Apostol, Julia Bosque-Gil, Christian Chiarcos, Milan Dojchinovski, Katerina Gkirtzou, Jorge Gracia, Dagmar Gromann, Chaya Liebeskind, Giedrė Valūnaitė Oleškevičienė, Gilles Sérasset, Ciprian-Octavian Truică

In this paper, we provide an overview of current technologies for cross-lingual link discovery, and we discuss challenges, experiences and prospects of their application to under-resourced languages.
