Transcribed speech and user-generated text in Arabic typically contain a mixture of Modern Standard Arabic (MSA), the standardized language taught in schools, and Dialectal Arabic (DA), used in daily communications.

20 Oct 2023

RoDia: A New Dataset for Romanian Dialect Identification from Speech

codrut2/rodia • 6 Sep 2023

We introduce RoDia, the first dataset for Romanian dialect identification from speech.

DADA: Dialect Adaptation via Dynamic Aggregation of Linguistic Rules

salt-nlp/dada • 22 May 2023

We show that DADA is effective for both single-task and instruction-finetuned language models, offering an extensible and interpretable framework for adapting existing LLMs to different English dialects.

North Sámi Dialect Identification with Self-supervised Speech Models

skakouros/sami_dialects • 19 May 2023

The North Sámi (NS) language encapsulates four primary dialectal variants that are related but also differ in their phonology, morphology, and vocabulary.

A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model

srijith-rkr/kaust-whisper-adapter • 18 May 2023

In this work, we explore parameter-efficient learning (PEL) techniques to repurpose a general-purpose speech model (GSM) for Arabic dialect identification (ADI).

Two-stage Pipeline for Multilingual Dialect Detection

ankit-vaidya19/eacl_vardial2023 • 6 Mar 2023

Our proposed approach consists of a two-stage system and outperforms other participants' systems and previous works in this domain.

Dialect Identification
