Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data

mainlp/barner 19 Mar 2024

Named Entity Recognition (NER) is a fundamental task to extract key information from texts, but annotated resources are scarce for dialects.

0

ArTST: Arabic Text and Speech Transformer

mbzuai-nlp/artst 25 Oct 2023

We present ArTST, a pre-trained Arabic text and speech transformer for supporting open-source speech technologies for the Arabic language.

16

GlotLID: Language Identification for Low-Resource Languages

cisnlp/glotlid 24 Oct 2023

Several recent papers have published good solutions for language identification (LID) for about 300 high-resource and medium-resource languages.

67

Arabic Dialect Identification under Scrutiny: Limitations of Single-label Classification

amr-keleg/adi-under-scrutiny 20 Oct 2023

Automatic Arabic Dialect Identification (ADI) of text has gained great popularity since it was introduced in the early 2010s.

3

ALDi: Quantifying the Arabic Level of Dialectness of Text

amr-keleg/aldi 20 Oct 2023

Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications.

2

RoDia: A New Dataset for Romanian Dialect Identification from Speech

codrut2/rodia 6 Sep 2023

We introduce RoDia, the first dataset for Romanian dialect identification from speech.

2

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

salt-nlp/dada 22 May 2023

We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.

6

North Sámi Dialect Identification with Self-supervised Speech Models

skakouros/sami_dialects 19 May 2023

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but that also have differences in their phonology, morphology, and vocabulary.

3

A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model

srijith-rkr/kaust-whisper-adapter 18 May 2023

In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI).

26

Two-stage Pipeline for Multilingual Dialect Detection

ankit-vaidya19/eacl_vardial2023 6 Mar 2023

Our proposed approach consists of a two-stage system and outperforms other participants' systems and previous works in this domain.

0