Search Results for author: Idris Abdulmumin

Found 8 papers, 2 papers with code

Hausa Visual Genome: A Dataset for Multi-Modal English to Hausa Machine Translation

no code implementations • 2 May 2022 • Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, Shantipriya Parida, Shamsuddeen Hassan Muhammad, Ibrahim Sa'id Ahmad, Subhadarshi Panda, Ondřej Bojar, Bashir Shehu Galadanci, Bello Shehu Bello

The Hausa Visual Genome is the first dataset of its kind and can be used for Hausa-English machine translation, multi-modal research, and image description, among various other natural language processing and generation tasks.

Machine Translation, Natural Language Processing, +1

NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

1 code implementation • 20 Jan 2022 • Shamsuddeen Hassan Muhammad, David Ifeoluwa Adelani, Sebastian Ruder, Ibrahim Said Ahmad, Idris Abdulmumin, Bello Shehu Bello, Monojit Choudhury, Chris Chinenye Emezue, Saheed Salahudeen Abdullahi, Anuoluwapo Aremu, Alipio Jeorge, Pavel Brazdil

We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yorùbá), consisting of around 30,000 annotated tweets per language (and 14,000 for Nigerian-Pidgin), including a significant fraction of code-mixed tweets.

Sentiment Analysis

A Hybrid Approach for Improved Low Resource Neural Machine Translation using Monolingual Data

no code implementations • 14 Nov 2020 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa, Habeebah Adamu Kakudi, Ismaila Idris Sinan

Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model which can reach an acceptable standard of accuracy.

Low-Resource Neural Machine Translation, Self-Learning, +1

Enhanced back-translation for low resource neural machine translation using self-training

no code implementations • 4 Jun 2020 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa

The synthetic data generated by the improved English-German backward model was used to train a forward model which outperformed another forward model trained using standard back-translation by 2.7 BLEU.

Low-Resource Neural Machine Translation, Translation

Tag-less Back-Translation

no code implementations • 22 Dec 2019 • Idris Abdulmumin, Bashir Shehu Galadanci, Aliyu Garba

The standard back-translation method has been shown to be unable to efficiently utilize the huge amount of available monolingual data because translation models cannot differentiate between authentic and synthetic parallel data during training.

Ranked #27 on Machine Translation on IWSLT2014 German-English (using extra training data)

Domain Adaptation, Machine Translation, +2

Iterative Batch Back-Translation for Neural Machine Translation: A Conceptual Model

no code implementations • 26 Nov 2019 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa

An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of back-translations of the target-side monolingual data.

Machine Translation, Translation

hauWE: Hausa Words Embedding for Natural Language Processing

no code implementations • 25 Nov 2019 • Idris Abdulmumin, Bashir Shehu Galadanci

This work presents word embedding models using Word2Vec's Continuous Bag of Words (CBoW) and Skip-Gram (SG) models.

Machine Translation, named-entity-recognition, +5
