no code implementations • 2 Jan 2024 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed
The findings of this study contribute to advancing NLP research in low-resource settings, enabling greater accessibility and inclusion for African languages in a rapidly expanding digital landscape.
no code implementations • 24 Oct 2023 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, Chiyu Zhang, El Moatez Billah Nagoudi, Houda Bouamor, Nizar Habash
We describe the findings of the fourth Nuanced Arabic Dialect Identification Shared Task (NADI 2023).
no code implementations • 24 Oct 2023 • AbdelRahim Elmadany, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
While many researchers have proposed models and solutions for individual problems, there is an acute shortage of a comprehensive Arabic natural language generation toolkit that is capable of handling a wide range of tasks.
no code implementations • 24 Oct 2023 • Mustafa Jarrar, Muhammad Abdul-Mageed, Mohammed Khalilia, Bashar Talafha, AbdelRahim Elmadany, Nagham Hamad, Alaa' Omar
The winning teams achieved F1 scores of 91.96 and 93.73 in FlatNER and NestedNER, respectively.
no code implementations • 17 Oct 2023 • Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We train a wide range of models such as HuBERT (DID), Whisper, and XLS-R (ASR) in a supervised setting for Arabic DID and ASR tasks.
no code implementations • 1 Jun 2023 • Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed
As these pipelines require application of ADI tools to potentially out-of-domain data, we aim to investigate how vulnerable the tools may be to this domain shift.
no code implementations • 24 May 2023 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Ahmed El-Shangiti, Muhammad Abdul-Mageed
We present Dolphin, a novel benchmark that addresses the need for a natural language generation (NLG) evaluation framework dedicated to the wide collection of Arabic languages and varieties.
no code implementations • 21 Apr 2023 • Gagan Bhatia, Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We describe our contribution to the SemEval 2023 AfriSenti-SemEval shared task, where we tackle the task of sentiment analysis in 14 different African languages.
no code implementations • 21 Dec 2022 • El Moatez Billah Nagoudi, Muhammad Abdul-Mageed, AbdelRahim Elmadany, Alcides Alcoba Inciarte, Md Tawkat Islam Khondaker
Scholarship on generative pretraining (GPT) remains acutely Anglocentric, leaving serious gaps in our understanding of the whole class of autoregressive models.
no code implementations • 21 Dec 2022 • AbdelRahim Elmadany, El Moatez Billah Nagoudi, Muhammad Abdul-Mageed
Because pretrained language models play a crucial role across NLP, several benchmarks have been proposed to evaluate them.
no code implementations • 21 Dec 2022 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte
Multilingual pretrained language models (mPLMs) acquire valuable, generalizable linguistic information during pretraining and have advanced the state of the art on task-specific finetuning.
1 code implementation • 22 Oct 2022 • Md Tawkat Islam Khondaker, El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan
Contrastive learning (CL) brought significant progress to various NLP tasks.
1 code implementation • 21 Oct 2022 • Ife Adebara, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Alcides Alcoba Inciarte
Problematically, most of the world's 7000+ languages today are not covered by LID technologies.
1 code implementation • 18 Oct 2022 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash
We describe findings of the third Nuanced Arabic Dialect Identification Shared Task (NADI 2022).
1 code implementation • OSACT (LREC) 2022 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
We present TURJUMAN, a neural toolkit for translating from 20 languages into Modern Standard Arabic (MSA).
no code implementations • ACL 2022 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
For evaluation, we introduce a novel benchmark for ARabic language GENeration (ARGEN), covering seven important tasks.
no code implementations • ACL 2021 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi
To evaluate our models, we also introduce ARLUE, a new benchmark for multi-dialectal Arabic language understanding evaluation.
no code implementations • NAACL (CALCS) 2021 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed
Our work is in the context of the Shared Task on Machine Translation in Code-Switching.
1 code implementation • EACL (WANLP) 2021 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Houda Bouamor, Nizar Habash
This Shared Task includes four subtasks: country-level Modern Standard Arabic (MSA) identification (Subtask 1.1), country-level dialect identification (Subtask 1.2), province-level MSA identification (Subtask 2.1), and province-level sub-dialect identification (Subtask 2.2).
2 code implementations • 27 Dec 2020 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi
To evaluate our models, we also introduce ARLUE, a new benchmark for multi-dialectal Arabic language understanding evaluation.
1 code implementation • EACL (WANLP) 2021 • Muhammad Abdul-Mageed, Shady Elbassuoni, Jad Doughman, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Yorgo Zoughby, Ahmad Shaher, Iskander Gaba, Ahmed Helal, Mohammed El-Razzaz
We describe DiaLex, a benchmark for intrinsic evaluation of dialectal Arabic word embedding.
1 code implementation • COLING (WANLP) 2020 • El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed, Tariq Alhindi, Hasan Cavusoglu
Finally, we develop the first models for detecting manipulated Arabic news and achieve state-of-the-art results on Arabic fake news detection (macro F1 = 70.06).
1 code implementation • EMNLP 2020 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Lyle Ungar
Although the prediction of dialects is an important language processing task, with a wide range of applications, existing work is largely limited to coarse-grained varieties.
no code implementations • LREC 2020 • AbdelRahim Elmadany, Chiyu Zhang, Muhammad Abdul-Mageed, Azadeh Hashemi
Social media are pervasive in our lives, making it necessary to ensure safe online experiences by detecting and removing offensive and hate speech.
1 code implementation • EACL 2021 • Muhammad Abdul-Mageed, AbdelRahim Elmadany, El Moatez Billah Nagoudi, Dinesh Pabbi, Kunal Verma, Rannie Lin
We describe Mega-COV, a billion-scale dataset from Twitter for studying COVID-19.
no code implementations • 2 Nov 2019 • Muhammad Abdul-Mageed, Chiyu Zhang, Arun Rajendran, AbdelRahim Elmadany, Michael Przystupa, Lyle Ungar
In this work we exploit a newly-created Arabic dataset with ground truth age and gender labels to learn these attributes both individually and in a multi-task setting at the sentence level.
no code implementations • 31 Oct 2019 • Muhammad Abdul-Mageed, Chiyu Zhang, AbdelRahim Elmadany, Arun Rajendran, Lyle Ungar
Prediction of language varieties and dialects is an important language processing task, with a wide range of applications.
no code implementations • WS 2019 • Bushra Algotiml, AbdelRahim Elmadany, Walid Magdy
Speech acts are the actions that a speaker intends when performing an utterance within conversations.
no code implementations • LREC 2018 • AbdelRahim Elmadany, Sherif Abdou, Mervat Gheith
The ability to model and automatically detect dialogue acts is an important step toward understanding spontaneous speech and instant messages.