Search Results for author: Iroro Orife

Found 15 papers, 9 papers with code

A Generalized Bandsplit Neural Network for Cinematic Audio Source Separation

1 code implementation • 5 Sep 2023 • Karn N. Watcharasupat, Chih-Wei Wu, Yiwei Ding, Iroro Orife, Aaron J. Hipple, Phillip A. Williams, Scott Kramer, Alexander Lerch, William Wolcott

Cinematic audio source separation is a relatively new subtask of audio source separation, with the aim of extracting the dialogue, music, and effects stems from their mixture.

Audio Source Separation

ÌròyìnSpeech: A multi-purpose Yorùbá Speech Corpus

1 code implementation • 29 Jul 2023 • Tolulope Ogunremi, Kola Tubosun, Anuoluwapo Aremu, Iroro Orife, David Ifeoluwa Adelani

To encourage a participatory approach to data creation, we provide 5000 curated sentences to the Mozilla Common Voice platform to crowd-source the recording and validation of Yorùbá speech data.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Looking Similar, Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

no code implementations • 12 Apr 2023 • Nikhil Singh, Chih-Wei Wu, Iroro Orife, Mahdi Kalayeh

We additionally compare this approach to a strong baseline where we remove speech before pretraining, and find that dub-augmented training is more effective, including for paralinguistic and audiovisual tasks where speech removal leads to worse performance.

Contrastive Learning counterfactual +1

BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus

1 code implementation • 7 Jul 2022 • Josh Meyer, David Ifeoluwa Adelani, Edresson Casanova, Alp Öktem, Daniel Whitenack, Julian Weber, Salomon Kabongo, Elizabeth Salesky, Iroro Orife, Colin Leong, Perez Ogayo, Chris Emezue, Jonathan Mukiibi, Salomey Osei, Apelete Agbolo, Victor Akinode, Bernard Opoku, Samuel Olanrewaju, Jesujoba Alabi, Shamsuddeen Muhammad

BibleTTS is a large, high-quality, open speech dataset for ten languages spoken in Sub-Saharan Africa.

Vocal Bursts Intensity Prediction

Learning Nigerian accent embeddings from speech: preliminary results based on SautiDB-Naija corpus

no code implementations • 12 Dec 2021 • Tejumade Afonja, Oladimeji Mudele, Iroro Orife, Kenechi Dukor, Lawrence Francis, Duru Goodness, Oluwafemi Azeez, Ademola Malomo, Clinton Mbataku

We describe how the corpus was created and curated as well as preliminary experiments with accent classification and learning Nigerian accent embeddings.

Classification

AVASpeech-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-Occurrence

1 code implementation • 2 Nov 2021 • Yun-Ning Hung, Karn N. Watcharasupat, Chih-Wei Wu, Iroro Orife, Kelian Li, Pavan Seshadri, Junyoung Lee

We propose a dataset, AVASpeech-SMAD, to assist speech and music activity detection research.

Action Detection Activity Detection

MasakhaNER: Named Entity Recognition for African Languages

2 code implementations • 22 Mar 2021 • David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets

no code implementations • 22 Mar 2021 • Julia Kreutzer, Isaac Caswell, Lisa Wang, Ahsan Wahab, Daan van Esch, Nasanbayar Ulzii-Orshikh, Allahsera Tapo, Nishant Subramani, Artem Sokolov, Claytone Sikasote, Monang Setyawan, Supheakmungkol Sarin, Sokhar Samb, Benoît Sagot, Clara Rivera, Annette Rios, Isabel Papadimitriou, Salomey Osei, Pedro Ortiz Suarez, Iroro Orife, Kelechi Ogueji, Andre Niyongabo Rubungo, Toan Q. Nguyen, Mathias Müller, André Müller, Shamsuddeen Hassan Muhammad, Nanda Muhammad, Ayanda Mnyakeni, Jamshidbek Mirzakhalov, Tapiwanashe Matangira, Colin Leong, Nze Lawson, Sneha Kudugunta, Yacine Jernite, Mathias Jenny, Orhan Firat, Bonaventure F. P. Dossou, Sakhile Dlamini, Nisansa de Silva, Sakine Çabuk Ballı, Stella Biderman, Alessia Battisti, Ahmed Baruwa, Ankur Bapna, Pallavi Baljekar, Israel Abebe Azime, Ayodele Awokoya, Duygu Ataman, Orevaoghene Ahia, Oghenefego Ahia, Sweta Agrawal, Mofetoluwa Adeyemi

With the success of large-scale pre-training and multilingual modeling in Natural Language Processing (NLP), recent years have seen a proliferation of large, web-mined text datasets covering hundreds of languages.

Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages

4 code implementations • Findings of the Association for Computational Linguistics 2020 • Wilhelmina Nekoto, Vukosi Marivate, Tshinondiwa Matsila, Timi Fasubaa, Tajudeen Kolawole, Taiwo Fagbohungbe, Solomon Oluwole Akinola, Shamsuddeen Hassan Muhammad, Salomon Kabongo, Salomey Osei, Sackey Freshia, Rubungo Andre Niyongabo, Ricky Macharm, Perez Ogayo, Orevaoghene Ahia, Musie Meressa, Mofe Adeyemi, Masabata Mokgesi-Selinga, Lawrence Okegbemi, Laura Jane Martinus, Kolawole Tajudeen, Kevin Degila, Kelechi Ogueji, Kathleen Siminyu, Julia Kreutzer, Jason Webster, Jamiil Toure Ali, Jade Abbott, Iroro Orife, Ignatius Ezeani, Idris Abdulkabir Dangana, Herman Kamper, Hady Elsahar, Goodness Duru, Ghollah Kioko, Espoir Murhabazi, Elan van Biljon, Daniel Whitenack, Christopher Onyefuluchi, Chris Emezue, Bonaventure Dossou, Blessing Sibanda, Blessing Itoro Bassey, Ayodele Olabiyi, Arshath Ramkilowan, Alp Öktem, Adewale Akinfaderin, Abdallah Bashir

Research in NLP lacks geographic diversity, and the question of how NLP can be scaled to low-resourced languages has not yet been adequately solved.

Machine Translation Translation

Towards Neural Machine Translation for Edoid Languages

no code implementations • 24 Mar 2020 • Iroro Orife

Many Nigerian languages have relinquished their previous prestige and purpose in modern society to English and Nigerian Pidgin.

Machine Translation NMT +1

Improving Yorùbá Diacritic Restoration

1 code implementation • 23 Mar 2020 • Iroro Orife, David I. Adelani, Timi Fasubaa, Victor Williamson, Wuraola Fisayo Oyewusi, Olamilekan Wahab, Kola Tubosun

Yorùbá is a widely spoken West African language with a writing system rich in orthographic and tonal diacritics.

Masakhane -- Machine Translation For Africa

2 code implementations • 13 Mar 2020 • Iroro Orife, Julia Kreutzer, Blessing Sibanda, Daniel Whitenack, Kathleen Siminyu, Laura Martinus, Jamiil Toure Ali, Jade Abbott, Vukosi Marivate, Salomon Kabongo, Musie Meressa, Espoir Murhabazi, Orevaoghene Ahia, Elan van Biljon, Arshath Ramkilowan, Adewale Akinfaderin, Alp Öktem, Wole Akin, Ghollah Kioko, Kevin Degila, Herman Kamper, Bonaventure Dossou, Chris Emezue, Kelechi Ogueji, Abdallah Bashir

Africa has over 2000 languages.

Machine Translation Translation

The Marchex 2018 English Conversational Telephone Speech Recognition System

no code implementations • 5 Nov 2018 • Seongjun Hahm, Iroro Orife, Shane Walker, Jason Flaks

In this paper, we describe recent performance improvements to the production Marchex speech recognition system for our spontaneous customer-to-business telephone conversations.

Language Modelling speech-recognition +1

Attentive Sequence-to-Sequence Learning for Diacritic Restoration of Yorùbá Language Text

1 code implementation • 3 Apr 2018 • Iroro Orife

Yorùbá is a widely spoken West African language with a writing system rich in tonal and orthographic diacritics.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Semi-Supervised Model Training for Unbounded Conversational Speech Recognition

no code implementations • 26 May 2017 • Shane Walker, Morten Pedersen, Iroro Orife, Jason Flaks

For conversational large-vocabulary continuous speech recognition (LVCSR) tasks, up to about two thousand hours of audio is commonly used to train state of the art models.

Language Modelling speech-recognition +1
