Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Most implemented papers

Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task

mohamedgabr96/NeuralDialectDetector • EACL (WANLP) 2021

The tasks are to identify the geographic origin of short Dialectal Arabic (DA) and Modern Standard Arabic (MSA) utterances at both the country and province levels.

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

UBC-NLP/nadi • EACL (WANLP) 2021

This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).

Dynamic Multi-scale Convolution for Dialect Identification

yuyq96/d-tdnn • 2 Aug 2021

We propose a new architecture, named dynamic multi-scale convolution, which consists of dynamic kernel convolution, local multi-scale learning, and global multi-scale pooling.

Finnish Dialect Identification: The Effect of Audio and Text

rootroo-ltd/finnishdialectidentification • EMNLP 2021

Finnish is a language with multiple dialects that not only differ from each other in terms of accent (pronunciation) but also in terms of morphological forms and lexical choice.

Distilling the Knowledge of Romanian BERTs Using Multiple Teachers

racai-ai/romanian-distilbert • LREC 2022

In this work, we introduce three light and fast versions of distilled BERT models for the Romanian language: Distil-BERT-base-ro, Distil-RoBERT-base, and DistilMulti-BERT-base-ro.

NADI 2022: The Third Nuanced Arabic Dialect Identification Shared Task

UBC-NLP/nadi • 18 Oct 2022

We describe findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022).

A Benchmark Study of Contrastive Learning for Arabic Social Meaning

Tawkat/Arabic-CL-Benchmark • 22 Oct 2022

Contrastive learning (CL) brought significant progress to various NLP tasks.

FreCDo: A Large Corpus for French Cross-Domain Dialect Identification

mihaelagaman/frecdo • 15 Dec 2022

We present a novel corpus for French dialect identification comprising 413,522 French text samples collected from public news websites in Belgium, Canada, France and Switzerland.

Two-stage Pipeline for Multilingual Dialect Detection

ankit-vaidya19/eacl_vardial2023 • 6 Mar 2023

Our proposed approach consists of a two-stage system and outperforms other participants' systems and previous works in this domain.

A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model

srijith-rkr/kaust-whisper-adapter • 18 May 2023

In this work, we explore Parameter-Efficient-Learning (PEL) techniques to repurpose a General-Purpose-Speech (GSM) model for Arabic dialect identification (ADI).
