Dialect Identification

11 papers with code • 0 benchmarks • 0 datasets

Dialectal Arabic Identification


Evaluating Unsupervised Dutch Word Embeddings as a Linguistic Resource

clips/dutchembeddings LREC 2016

With this research, we provide the embeddings themselves, the relation evaluation task benchmark for use in further research, and demonstrate how the benchmarked embeddings prove a useful unsupervised linguistic resource, effectively used in a downstream task.

Dialect Identification Word Embeddings

Speech Recognition Challenge in the Wild: Arabic MGB-3

qcri/dialectID 21 Sep 2017

Two hours of audio per dialect were released for development and a further two hours were used for evaluation.

Dialect Identification Speech Recognition

Automatic Dialect Detection in Arabic Broadcast Speech

Qatar-Computing-Research-Institute/dialectID 23 Sep 2015

We used these features in a binary classifier to discriminate between Modern Standard Arabic (MSA) and Dialectal Arabic, with an accuracy of 100%.

Dialect Identification Speech Recognition +1

Multi-Dialect Arabic BERT for Country-Level Dialect Identification

mawdoo3/Multi-dialect-Arabic-BERT 10 Jul 2020

Our winning solution itself came in the form of an ensemble of different training iterations of our pre-trained BERT model, which achieved a micro-averaged F1-score of 26.78% on the subtask at hand.

Dialect Identification Language Modelling
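
The excerpt above mentions an ensemble of several training runs scored by micro-averaged F1, without saying how the runs were combined. As a rough illustration only, the sketch below assumes the simplest scheme (averaging per-class probabilities across runs, then taking the argmax) and computes micro-averaged F1 by pooling counts over all labels. The function names `ensemble_predict` and `micro_f1` and the toy numbers are hypothetical and are not taken from the paper or its repository.

```python
from typing import List, Sequence


def ensemble_predict(prob_runs: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities over runs, then pick the best class per example.

    prob_runs[r][i][c] is the probability run r assigns to class c for example i.
    Probability averaging is an assumption; the paper may combine runs differently.
    """
    n_runs = len(prob_runs)
    n_examples = len(prob_runs[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(prob_runs[0][i])
        mean = [sum(run[i][c] for run in prob_runs) / n_runs for c in range(n_classes)]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def micro_f1(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Micro-averaged F1: pool TP, FP and FN over all labels before computing F1."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: two runs, three examples, two classes (made-up numbers).
runs = [
    [[0.6, 0.4], [0.2, 0.8], [0.7, 0.3]],
    [[0.3, 0.7], [0.4, 0.6], [0.9, 0.1]],
]
gold = [0, 1, 0]
pred = ensemble_predict(runs)
score = micro_f1(gold, pred)
```

With one label per example, every wrong prediction counts once as a false positive and once as a false negative, so micro-averaged F1 reduces to plain accuracy in this setting.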

A Character-level Convolutional Neural Network for Distinguishing Similar Languages and Dialects

boknilev/dsl-char-cnn WS 2016

Discriminating between closely-related language varieties is considered a challenging and important task.

Dialect Identification

AraCOVID19-MFH: Arabic COVID-19 Multi-label Fake News and Hate Speech Detection Dataset

maroxtn/tun-sentiment 7 May 2021

This paper releases "AraCOVID19-MFH", a manually annotated multi-label Arabic COVID-19 fake news and hate speech detection dataset.

Dialect Identification Fact Checking +3

NADI 2021: The Second Nuanced Arabic Dialect Identification Shared Task

UBC-NLP/nadi 4 Mar 2021

This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).

Dialect Identification

Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments

UBC-NLP/microdialects EMNLP 2020

Although the prediction of dialects is an important language processing task, with a wide range of applications, existing work is largely limited to coarse-grained varieties.

Dialect Identification Language Modelling +1

Adapting MARBERT for Improved Arabic Dialect Identification: Submission to the NADI 2021 Shared Task

mohamedgabr96/NeuralDialectDetector 1 Mar 2021

Tasks are to identify the geographic origin of short Dialectal (DA) and Modern Standard Arabic (MSA) utterances at the levels of both country and province.

Dialect Identification