Search Results for author: Bonaventure F. P. Dossou

Found 29 papers, 17 papers with code

CodeUnlearn: Amortized Zero-Shot Machine Unlearning in Language Models Using Discrete Concept

no code implementations 8 Oct 2024 Yuxuan Wu, Bonaventure F. P. Dossou, Dianbo Liu

Large Language Models (LLMs) offer extensive knowledge across various domains, but they may inadvertently memorize sensitive, unauthorized, or malicious data, such as personal information in the medical and financial sectors.

Machine Unlearning

InkubaLM: A small language model for low-resource African languages

no code implementations 30 Aug 2024 Atnafu Lambebo Tonja, Bonaventure F. P. Dossou, Jessica Ojo, Jenalea Rajab, Fadel Thior, Eric Peter Wairagala, Anuoluwapo Aremu, Pelonomi Moiloa, Jade Abbott, Vukosi Marivate, Benjamin Rosman

High-resource language models often fall short in the African context, where there is a critical need for models that are efficient, accessible, and locally relevant, even amidst significant computing and data constraints.

Language Modelling Machine Translation +2

A Study of Acquisition Functions for Medical Imaging Deep Active Learning

1 code implementation 28 Jan 2024 Bonaventure F. P. Dossou

In this work, we show how active learning could be very effective in data scarcity situations, where obtaining labeled data is difficult (or the annotation budget is very limited).

Active Learning Breast Cancer Detection +1

FonMTL: Towards Multitask Learning for the Fon Language

1 code implementation 28 Aug 2023 Bonaventure F. P. Dossou, Iffanice Houndayi, Pamely Zantou, Gilles Hacheme

Multitask learning is a learning paradigm that aims to improve the generalization capacity of a model by sharing knowledge across different but related tasks: this could be prevalent in very data-scarce scenarios.

Language Modelling named-entity-recognition +4

Advancing African-Accented Speech Recognition: Epistemic Uncertainty-Driven Data Selection for Generalizable ASR Models

1 code implementation 3 Jun 2023 Bonaventure F. P. Dossou

Combining several active learning paradigms and the core-set approach, we propose a new multi-rounds adaptation process that uses epistemic uncertainty to automate the annotation process, significantly reducing the associated costs and human labor.

Accented Speech Recognition Active Learning +4

AfriNames: Most ASR models "butcher" African Names

no code implementations 1 Jun 2023 Tobi Olatunji, Tejumade Afonja, Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Chris Chinenye Emezue, Amina Mardiyyah Rufai, Sahib Singh

Useful conversational agents must accurately capture named entities to minimize error for downstream tasks, for example, asking a voice assistant to play a track from a certain artist, initiating navigation to a specific location, or documenting a laboratory result for a patient.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

MasakhaNEWS: News Topic Classification for African languages

1 code implementation 19 Apr 2023 David Ifeoluwa Adelani, Marek Masiak, Israel Abebe Azime, Jesujoba Alabi, Atnafu Lambebo Tonja, Christine Mwase, Odunayo Ogundepo, Bonaventure F. P. Dossou, Akintunde Oladipo, Doreen Nixdorf, Chris Chinenye Emezue, sana al-azzawi, Blessing Sibanda, Davis David, Lolwethu Ndolela, Jonathan Mukiibi, Tunde Ajayi, Tatiana Moteu, Brian Odhiambo, Abraham Owodunni, Nnaemeka Obiefuna, Muhidin Mohamed, Shamsuddeen Hassan Muhammad, Teshome Mulugeta Ababu, Saheed Abdullahi Salahudeen, Mesay Gemeda Yigezu, Tajuddeen Gwadabe, Idris Abdulmumin, Mahlet Taye, Oluwabusayo Awoyomi, Iyanuoluwa Shode, Tolulope Adelani, Habiba Abdulganiyu, Abdul-Hakeem Omotayo, Adetola Adeeko, Abeeb Afolabi, Anuoluwapo Aremu, Olanrewaju Samuel, Clemencia Siro, Wangari Kimotho, Onyekachi Ogbu, Chinedu Mbonu, Chiamaka Chukwuneke, Samuel Fanijo, Jessica Ojo, Oyinkansola Awosan, Tadesse Kebede, Toadoum Sari Sakayo, Pamela Nyatsine, Freedmore Sidume, Oreen Yousuf, Mardiyyah Oduwole, Tshinu Tshinu, Ussen Kimanuka, Thina Diko, Siyanda Nxakama, Sinodos Nigusse, Abdulmejid Johar, Shafie Mohamed, Fuad Mire Hassan, Moges Ahmed Mehamed, Evrard Ngabire, Jules Jules, Ivan Ssenkungu, Pontus Stenetorp

Furthermore, we explore several alternatives to full fine-tuning of language models that are better suited for zero-shot and few-shot learning such as cross-lingual parameter-efficient fine-tuning (like MAD-X), pattern exploiting training (PET), prompting language models (like ChatGPT), and prompt-free sentence transformer fine-tuning (SetFit and Cohere Embedding API).

Classification Few-Shot Learning +7

AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages

1 code implementation 7 Nov 2022 Bonaventure F. P. Dossou, Atnafu Lambebo Tonja, Oreen Yousuf, Salomey Osei, Abigail Oppong, Iyanuoluwa Shode, Oluwabusayo Olufunke Awoyomi, Chris Chinenye Emezue

In this paper, we present AfroLM, a multilingual language model pretrained from scratch on 23 African languages (the largest effort to date) using our novel self-active learning framework.

Active Learning Language Modelling +4

Graph-Based Active Machine Learning Method for Diverse and Novel Antimicrobial Peptides Generation and Selection

no code implementations 18 Sep 2022 Bonaventure F. P. Dossou, Dianbo Liu, Xu Ji, Moksh Jain, Almer M. van der Sloot, Roger Palou, Michael Tyers, Yoshua Bengio

As antibiotic-resistant bacterial strains are rapidly spreading worldwide, infections caused by these strains are emerging as a global crisis causing the death of millions of people every year.

Diversity

MMTAfrica: Multilingual Machine Translation for African Languages

1 code implementation WMT (EMNLP) 2021 Chris C. Emezue, Bonaventure F. P. Dossou

In this paper, we focus on the task of multilingual machine translation for African languages and describe our contribution in the 2021 WMT Shared Task: Large-Scale Multilingual Machine Translation.

Machine Translation Translation

Biological Sequence Design with GFlowNets

1 code implementation 2 Mar 2022 Moksh Jain, Emmanuel Bengio, Alex-Hernandez Garcia, Jarrid Rector-Brooks, Bonaventure F. P. Dossou, Chanakya Ekbote, Jie Fu, Tianyu Zhang, Micheal Kilgour, Dinghuai Zhang, Lena Simine, Payel Das, Yoshua Bengio

In this work, we propose an active learning algorithm leveraging epistemic uncertainty estimation and the recently proposed GFlowNets as a generator of diverse candidate solutions, with the objective to obtain a diverse batch of useful (as defined by some utility function, for example, the predicted anti-microbial activity of a peptide) and informative candidates after each round.

Active Learning Diversity

FSER: Deep Convolutional Neural Networks for Speech Emotion Recognition

no code implementations 15 Sep 2021 Bonaventure F. P. Dossou, Yeno K. S. Gbenou

Using mel-spectrograms over conventional MFCC features, we assess the abilities of convolutional neural networks to accurately recognize and classify emotions from speech data.

Speech Emotion Recognition valid

MasakhaNER: Named Entity Recognition for African Languages

2 code implementations 22 Mar 2021 David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

Crowdsourced Phrase-Based Tokenization for Low-Resourced Neural Machine Translation: The Case of Fon Language

no code implementations 14 Mar 2021 Bonaventure F. P. Dossou, Chris C. Emezue

Building effective neural machine translation (NMT) models for very low-resourced and morphologically rich African indigenous languages is an open challenge.

Machine Translation NMT +1

OkwuGbé: End-to-End Speech Recognition for Fon and Igbo

4 code implementations 13 Mar 2021 Bonaventure F. P. Dossou, Chris C. Emezue

Our linguistic analyses (for Fon and Igbo) provide valuable insights and guidance into the creation of speech recognition models for other African low-resourced languages, as well as guide future NLP research for Fon and Igbo.

Machine Translation speech-recognition +1

AfriVEC: Word Embedding Models for African Languages. Case Study of Fon and Nobiin

1 code implementation 8 Mar 2021 Bonaventure F. P. Dossou, Mohammed Sabry

From Word2Vec to GloVe, word embedding models have played key roles in the current state-of-the-art results achieved in Natural Language Processing.

Transfer Learning

Crowd-sourced Phrase-Based Tokenization for Low-Resourced Neural Machine Translation: The case of Fon Language

no code implementations 1 Jan 2021 Bonaventure F. P. Dossou, Chris Chinenye Emezue

Building effective neural machine translation (NMT) models for very low-resourced and morphologically rich African indigenous languages is an open challenge.

Machine Translation NMT +1

Lanfrica: A Participatory Approach to Documenting Machine Translation Research on African Languages

no code implementations 3 Aug 2020 Chris C. Emezue, Bonaventure F. P. Dossou

Over the years, there have been campaigns to include the African languages in the growing research on machine translation (MT) in particular, and natural language processing (NLP) in general.

Diversity Machine Translation +1

FFR v1.1: Fon-French Neural Machine Translation

1 code implementation 14 Jun 2020 Bonaventure F. P. Dossou, Chris C. Emezue

All over the world and especially in Africa, researchers are putting efforts into building Neural Machine Translation (NMT) systems to help tackle the language barriers in Africa, a continent of over 2000 different languages.

Machine Translation NMT +1
