no code implementations • LREC 2022 • Idris Abdulmumin, Satya Ranjan Dash, Musa Abdullahi Dawud, Shantipriya Parida, Shamsuddeen Hassan Muhammad, Ibrahim Sa'id Ahmad, Subhadarshi Panda, Ondřej Bojar, Bashir Shehu Galadanci, Bello Shehu Bello
The Hausa Visual Genome is the first dataset of its kind and can be used for Hausa-English machine translation, multi-modal research, and image description, among various other natural language processing and generation tasks.
no code implementations • 14 Nov 2020 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa, Habeebah Adamu Kakudi, Ismaila Idris Sinan
Many language pairs are low resource, meaning the amount and/or quality of available parallel data is not sufficient to train a neural machine translation (NMT) model which can reach an acceptable standard of accuracy.
no code implementations • 4 Jun 2020 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa
The synthetic data generated by the improved English-German backward model was used to train a forward model which outperformed another forward model trained using standard back-translation by 2.7 BLEU.
no code implementations • 22 Dec 2019 • Idris Abdulmumin, Bashir Shehu Galadanci, Aliyu Garba
The standard back-translation method has been shown to under-utilize the huge amount of existing monolingual data because translation models cannot differentiate between authentic and synthetic parallel data during training.
Ranked #32 on Machine Translation on IWSLT2014 German-English (using extra training data)
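One widely used way to let a model distinguish the two kinds of data is to mark the synthetic source sentences with a reserved token. The sketch below illustrates that general tagging idea only; the tag name, helper function, and toy sentence pairs are hypothetical, and this is not necessarily the method proposed in the paper above.

```python
SYN_TAG = "<BT>"  # hypothetical reserved token marking back-translated data

def tag_synthetic(pairs, synthetic_flags):
    """Prepend SYN_TAG to the source side of synthetic pairs only,
    leaving authentic pairs untouched."""
    tagged = []
    for (src, tgt), is_syn in zip(pairs, synthetic_flags):
        tagged.append((f"{SYN_TAG} {src}" if is_syn else src, tgt))
    return tagged

# Toy German-English pairs: the first is synthetic, the second authentic.
data = [("der hund", "the dog"), ("die katze", "the cat")]
flags = [True, False]
print(tag_synthetic(data, flags))
# [('<BT> der hund', 'the dog'), ('die katze', 'the cat')]
```

During training the model then sees the tag as part of the input and can learn to weight synthetic examples differently.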
no code implementations • 26 Nov 2019 • Idris Abdulmumin, Bashir Shehu Galadanci, Abubakar Isa
An effective method to generate a large number of parallel sentences for training improved neural machine translation (NMT) systems is the use of back-translations of the target-side monolingual data.
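The data flow behind back-translation can be sketched as follows. The backward model here is a stand-in function (a real system would use a trained target-to-source NMT model); the helper names and toy sentences are hypothetical.

```python
def backward_translate(target_sentence):
    # Placeholder: a real backward model maps a target-language sentence
    # to a source-language translation.
    return "[synthetic] " + target_sentence

def build_synthetic_parallel(monolingual_target):
    """Pair each authentic target sentence with its synthetic source,
    producing (source, target) training pairs for the forward model."""
    return [(backward_translate(t), t) for t in monolingual_target]

# Monolingual target-side data: no source-side translations exist yet.
mono = ["the cat sat", "the dog ran"]
for src, tgt in build_synthetic_parallel(mono):
    print(src, "->", tgt)
```

The synthetic sentence goes on the input side and the authentic sentence on the output side, so the forward model always learns to generate genuine target-language text.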
no code implementations • 25 Nov 2019 • Idris Abdulmumin, Bashir Shehu Galadanci
This work presents word embedding models using Word2Vec's Continuous Bag of Words (CBoW) and Skip Gram (SG) architectures.
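The difference between the two architectures shows up in how training pairs are built from a corpus: CBoW predicts the center word from its surrounding context, while Skip Gram predicts each context word from the center word. The minimal sketch below constructs both kinds of pairs for a toy sentence (the sentence and window size are illustrative, not from the paper).

```python
def cbow_pairs(tokens, window=2):
    """CBoW: one (context words, center word) pair per position."""
    pairs = []
    for i, center in enumerate(tokens):
        context = [tokens[j]
                   for j in range(max(0, i - window),
                                  min(len(tokens), i + window + 1))
                   if j != i]
        pairs.append((context, center))
    return pairs

def skipgram_pairs(tokens, window=2):
    """Skip Gram: one (center word, context word) pair per context slot."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window),
                       min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = "ina son karanta littafi".split()  # toy Hausa sentence
print(cbow_pairs(sentence)[0])       # (['son', 'karanta'], 'ina')
print(skipgram_pairs(sentence)[:2])  # [('ina', 'son'), ('ina', 'karanta')]
```

Skip Gram generates more training pairs per sentence than CBoW, which is one reason it is often preferred for rare words at the cost of slower training.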