Search Results for author: Orevaoghene Ahia

Found 17 papers, 11 papers with code

MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling

no code implementations15 Mar 2024 Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer

A major consideration in multilingual language modeling is how to best represent languages with diverse vocabularies and scripts.

Language Modelling

Extracting Lexical Features from Dialects via Interpretable Dialect Classifiers

1 code implementation27 Feb 2024 Roy Xie, Orevaoghene Ahia, Yulia Tsvetkov, Antonios Anastasopoulos

Identifying linguistic differences between dialects of a language often requires expert knowledge and meticulous human analysis.

That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

1 code implementation23 Oct 2023 Jaechan Lee, Alisa Liu, Orevaoghene Ahia, Hila Gonen, Noah A. Smith

In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences.

Translation

Do All Languages Cost the Same? Tokenization in the Era of Commercial Language Models

no code implementations23 May 2023 Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Jungo Kasai, David R. Mortensen, Noah A. Smith, Yulia Tsvetkov

Language models have graduated from being research prototypes to commercialized products offered as web APIs, and recent works have highlighted the multilingual capabilities of these products.

Fairness Language Modelling

What a Creole Wants, What a Creole Needs

no code implementations LREC 2022 Heather Lent, Kelechi Ogueji, Miryam de Lhoneux, Orevaoghene Ahia, Anders Søgaard

We demonstrate, through conversations with Creole experts and surveys of Creole-speaking communities, how the things needed from language technology can change dramatically from one language to another, even when the languages are considered to be very similar to each other, as with Creoles.

MasakhaNER: Named Entity Recognition for African Languages

2 code implementations22 Mar 2021 David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.

named-entity-recognition Named Entity Recognition +2

PidginUNMT: Unsupervised Neural Machine Translation from West African Pidgin to English

2 code implementations7 Dec 2019 Kelechi Ogueji, Orevaoghene Ahia

In this work, we perform the first NLP work on the most popular variant of the language, providing three major contributions.

Machine Translation Translation

Cannot find the paper you are looking for? You can Submit a new open access paper.