Dialect Identification

32 papers with code • 0 benchmarks • 3 datasets

Dialectal Arabic Identification

Latest papers with no code

USTHB at NADI 2023 shared task: Exploring Preprocessing and Feature Engineering Strategies for Arabic Dialect Identification

no code yet • 16 Dec 2023

In this paper, we conduct an in-depth analysis of several key factors influencing Arabic dialect identification performance in the NADI 2023 shared task, with a specific focus on the first subtask, country-level dialect identification.

Self-supervised Adaptive Pre-training of Multilingual Speech Models for Language and Dialect Identification

no code yet • 12 Dec 2023

To address this challenge, we propose self-supervised adaptive pre-training (SAPT) to adapt the pre-trained model to the target domain and languages of the downstream task.

Mavericks at NADI 2023 Shared Task: Unravelling Regional Nuances through Dialect Identification using Transformer-based Approach

no code yet • 30 Nov 2023

We fine-tune these state-of-the-art models on the provided dataset.

NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task

no code yet • 24 Oct 2023

We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023).

Yet Another Model for Arabic Dialect Identification

no code yet • 20 Oct 2023

We explore two architectural variations: ResNet and ECAPA-TDNN, coupled with two types of acoustic features: MFCCs and features extracted from the pre-trained self-supervised model UniSpeech-SAT Large, as well as a fusion of all four variants.
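
The abstract does not say how the four variants are fused. As one possible reading, the sketch below shows score-level fusion: each system outputs per-dialect posterior probabilities, and the fused score is their weighted average. The system names, dialect labels, scores, and the `fuse_scores` and `predict` helpers are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional


def fuse_scores(
    system_scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of per-class scores from several systems.

    Assumption: every system scores the same classes in the same order.
    """
    names = list(system_scores)
    if weights is None:
        # Equal weighting when no per-system weights are supplied.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    n_classes = len(system_scores[names[0]])
    fused = [0.0] * n_classes
    for name in names:
        scores = system_scores[name]
        if len(scores) != n_classes:
            raise ValueError(f"{name} has {len(scores)} scores, expected {n_classes}")
        for i, score in enumerate(scores):
            fused[i] += weights[name] * score / total
    return fused


def predict(fused: List[float], labels: List[str]) -> str:
    """Return the label with the highest fused score."""
    best = max(range(len(fused)), key=fused.__getitem__)
    return labels[best]


# Hypothetical posteriors for one utterance (illustrative values only).
labels = ["EGY", "LEV", "GLF"]
scores = {
    "resnet_mfcc": [0.6, 0.3, 0.1],
    "resnet_ssl": [0.2, 0.7, 0.1],
    "ecapa_mfcc": [0.5, 0.4, 0.1],
    "ecapa_ssl": [0.3, 0.6, 0.1],
}
fused = fuse_scores(scores)
prediction = predict(fused, labels)
```

With equal weights the fused scores are simply the mean of the four systems; per-system weights could instead be tuned on a held-out development set.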

VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System

no code yet • 17 Oct 2023

We train a wide range of models such as HuBERT (DID), Whisper, and XLS-R (ASR) in a supervised setting for Arabic DID and ASR tasks.

Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition

no code yet • 17 Oct 2023

In this study, embeddings from advanced pre-trained language identification (LID) and speaker identification (SID) models are leveraged to improve the accuracy of accent classification and non-native accentedness assessment.

Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance

no code yet • 9 Aug 2023

Dialect classification is used in a variety of applications, such as machine translation and speech recognition, to improve the overall performance of the system.

Towards spoken dialect identification of Irish

no code yet • 14 Jul 2023

Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline.

On the Robustness of Arabic Speech Dialect Identification

no code yet • 1 Jun 2023

As these pipelines require application of ADI tools to potentially out-of-domain data, we aim to investigate how vulnerable the tools may be to this domain shift.
