Dialect Identification
32 papers with code • 0 benchmarks • 3 datasets
Dialectal Arabic Identification
Latest papers with no code
USTHB at NADI 2023 shared task: Exploring Preprocessing and Feature Engineering Strategies for Arabic Dialect Identification
In this paper, we conduct an in-depth analysis of several key factors influencing the performance of Arabic dialect identification in NADI'2023, with a specific focus on the first subtask, country-level dialect identification.
Self-supervised Adaptive Pre-training of Multilingual Speech Models for Language and Dialect Identification
To address this challenge, we propose self-supervised adaptive pre-training (SAPT) to adapt the pre-trained model to the target domain and languages of the downstream task.
Mavericks at NADI 2023 Shared Task: Unravelling Regional Nuances through Dialect Identification using Transformer-based Approach
We fine-tune these state-of-the-art models on the provided dataset.
NADI 2023: The Fourth Nuanced Arabic Dialect Identification Shared Task
We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023).
Yet Another Model for Arabic Dialect Identification
We explore two architectural variations: ResNet and ECAPA-TDNN, coupled with two types of acoustic features: MFCCs and features extracted from the pre-trained self-supervised model UniSpeech-SAT Large, as well as a fusion of all four variants.
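A fusion of four systems, as mentioned in the excerpt above, can be illustrated with a minimal sketch. The excerpt does not say how the fusion is performed, so the equal-weight averaging of per-dialect posteriors below is an assumption. The function names, dialect labels, and scores are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def fuse_scores(system_posteriors: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-dialect posteriors across systems with equal weights.

    Assumption: every system scores the same set of dialect labels.
    """
    labels = list(system_posteriors[0].keys())
    n_systems = len(system_posteriors)
    return {
        label: sum(p[label] for p in system_posteriors) / n_systems
        for label in labels
    }


def predict(system_posteriors: List[Dict[str, float]]) -> str:
    """Return the dialect label with the highest fused score."""
    fused = fuse_scores(system_posteriors)
    return max(fused, key=fused.get)


# Hypothetical posteriors from four systems
# (two architectures x two feature types) for one utterance.
systems = [
    {"EGY": 0.6, "GLF": 0.3, "LEV": 0.1},
    {"EGY": 0.4, "GLF": 0.5, "LEV": 0.1},
    {"EGY": 0.5, "GLF": 0.2, "LEV": 0.3},
    {"EGY": 0.3, "GLF": 0.4, "LEV": 0.3},
]

print(predict(systems))
```

Equal weights are the simplest choice; weights tuned on a development set are a common alternative when some systems are clearly stronger.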
VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System
We train a wide range of models such as HuBERT (DID), Whisper, and XLS-R (ASR) in a supervised setting for Arabic DID and ASR tasks.
Advanced accent/dialect identification and accentedness assessment with multi-embedding models and automatic speech recognition
In this study, embeddings from advanced pre-trained language identification (LID) and speaker identification (SID) models are leveraged to improve the accuracy of accent classification and non-native accentedness assessment.
Unsupervised Out-of-Distribution Dialect Detection with Mahalanobis Distance
Dialect classification is used in a variety of applications, such as machine translation and speech recognition, to improve the overall performance of the system.
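The Mahalanobis distance named in the title above can be sketched as follows. This is an illustration of the general distance, not the paper's pipeline: it fits a single Gaussian to in-domain embeddings, whereas practical systems often use one mean per class. The function names, the ridge term, the threshold, and the toy 2-D embeddings are all assumptions made for the example.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mean_vector(X: List[Vector]) -> Vector:
    """Component-wise mean of the in-domain embeddings."""
    n, d = len(X), len(X[0])
    return [sum(x[j] for x in X) / n for j in range(d)]


def covariance(X: List[Vector], mu: Vector, ridge: float = 1e-6) -> Matrix:
    """Sample covariance with a small ridge on the diagonal for stability."""
    n, d = len(X), len(mu)
    cov = [
        [sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in X) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]
    for i in range(d):
        cov[i][i] += ridge
    return cov


def solve(A: Matrix, b: Vector) -> Vector:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - tail) / M[i][i]
    return z


def mahalanobis_sq(x: Vector, mu: Vector, cov: Matrix) -> float:
    """Squared Mahalanobis distance (x - mu)^T cov^{-1} (x - mu)."""
    diff = [a - b for a, b in zip(x, mu)]
    z = solve(cov, diff)  # z = cov^{-1} (x - mu), without forming the inverse
    return sum(d * zi for d, zi in zip(diff, z))


def is_ood(x: Vector, mu: Vector, cov: Matrix, threshold: float) -> bool:
    """Flag x as out-of-distribution when its distance exceeds the threshold."""
    return mahalanobis_sq(x, mu, cov) > threshold


# Toy in-domain embeddings (hypothetical 2-D values).
in_domain = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mu = mean_vector(in_domain)
cov = covariance(in_domain, mu)

near = mahalanobis_sq([0.5, 0.5], mu, cov)
far = mahalanobis_sq([5.0, 5.0], mu, cov)
print(near, far, is_ood([5.0, 5.0], mu, cov, threshold=9.0))
```

The threshold would normally be chosen on held-out in-domain data, for example as a high percentile of the in-domain distances.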
Towards spoken dialect identification of Irish
Motivated by this, the present experiments investigate spoken dialect identification of Irish, with a view to incorporating such a system into the speech recognition pipeline.
On the Robustness of Arabic Speech Dialect Identification
As these pipelines require application of ADI tools to potentially out-of-domain data, we aim to investigate how vulnerable the tools may be to this domain shift.