Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Most implemented papers

Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task

mohamedgabr96/NeuralDialectDetector EACL (WANLP) 2021

The tasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels.

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

UBC-NLP/nadi EACL (WANLP) 2021

This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).

Dynamic Multi-scale Convolution for Dialect Identification

yuyq96/d-tdnn 2 Aug 2021

To address this issue, we propose a new architecture, named dynamic multi-scale convolution, which consists of dynamic kernel convolution, local multi-scale learning, and global multi-scale pooling.

Finnish Dialect Identification: The Effect of Audio and Text

rootroo-ltd/finnishdialectidentification EMNLP 2021

Finnish is a language with multiple dialects that not only differ from each other in terms of accent (pronunciation) but also in terms of morphological forms and lexical choice.

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers

racai-ai/romanian-distilbert LREC 2022

In this work, we introduce three light and fast versions of distilled BERT models for the Romanian language: Distil-BERT-base-ro, Distil-RoBERT-base, and DistilMulti-BERT-base-ro.

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task

UBC-NLP/nadi 18 Oct 2022

We describe findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022).

A Benchmark Study of Contrastive Learning for Arabic Social Meaning

Tawkat/Arabic-CL-Benchmark 22 Oct 2022

Contrastive learning (CL) brought significant progress to various NLP tasks.

FreCDo: A Large Corpus for French Cross-Domain Dialect Identification

mihaelagaman/frecdo 15 Dec 2022

We present a novel corpus for French dialect identification comprising 413,522 French text samples collected from public news websites in Belgium, Canada, France and Switzerland.

Two-stage Pipeline for Multilingual Dialect Detection

ankit-vaidya19/eacl_vardial2023 6 Mar 2023

Our proposed approach consists of a two-stage system and outperforms other participants' systems and previous works in this domain.

A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model

srijith-rkr/kaust-whisper-adapter 18 May 2023

In this work, we explore parameter-efficient learning (PEL) techniques to repurpose a general-purpose speech model (GSM) for Arabic dialect identification (ADI).