Search Results for author: Vukosi Marivate

Found 35 papers, 16 papers with code

Practical Approach on Implementation of WordNets for South African Languages

1 code implementation EACL (GWC) 2021 Tshephisho Joseph Sefara, Tumisho Billson Mokgonyane, Vukosi Marivate

This paper proposes the implementation of WordNets for five South African languages, namely, Sepedi, Setswana, Tshivenda, isiZulu and isiXhosa to be added to open multilingual WordNets (OMW) on natural language toolkit (NLTK).

LiSTra Automatic Speech Translation: English to Lingala Case Study

no code implementations DCLRL (LREC) 2022 Salomon Kabongo Kabenamualu, Vukosi Marivate, Herman Kamper

In recent years there has been great interest in addressing the data scarcity of African languages and providing baseline models for different Natural Language Processing tasks (Orife et al., 2020).

Automatic Speech Recognition (ASR) +4

Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot

no code implementations 26 Apr 2024 Michelle Terblanche, Kayode Olaleye, Vukosi Marivate

We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process.

Multimodal Misinformation Detection in a South African Social Media Environment

no code implementations 7 Dec 2023 Amica De Jager, Vukosi Marivate, Abiodun Modupe

This research contributes a multimodal MD model capable of functioning in the South African social media environment, as well as introduces a South African misinformation dataset.


PuoBERTa: Training and evaluation of a curated language model for Setswana

2 code implementations 13 Oct 2023 Vukosi Marivate, Moseli Mots'oehli, Valencia Wagner, Richard Lastrucci, Isheanesu Dzingirai

Natural language processing (NLP) has made significant progress for well-resourced languages such as English but lagged behind for low-resource languages like Setswana.

Language Modelling named-entity-recognition +5

Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting

no code implementations 30 Sep 2023 Baphumelele Masikisiki, Vukosi Marivate, Yvette Hlope

To address this challenge, the Chain of Thought (CoT) prompting method has been proposed as a means to enhance LLMs' proficiency in complex reasoning tasks like solving math word problems and answering questions based on logical argumentative reasoning.


Integrating Bidirectional Long Short-Term Memory with Subword Embedding for Authorship Attribution

no code implementations 26 Jun 2023 Abiodun Modupe, Turgay Celik, Vukosi Marivate, Oludayo O. Olugbara

However, character-based methods often fail to capture the sequential relationship of words in texts which is a chasm for further improvement.

Authorship Attribution

Textual Augmentation Techniques Applied to Low Resource Machine Translation: Case of Swahili

no code implementations 12 Jun 2023 Catherine Gitau, Vukosi Marivate

In this work we investigate the impact of applying textual data augmentation tasks to low resource machine translation.

Data Augmentation Machine Translation +4

Izindaba-Tindzaba: Machine learning news categorisation for Long and Short Text for isiZulu and Siswati

1 code implementation 12 Jun 2023 Andani Madodonga, Vukosi Marivate, Matthew Adendorff

Due to the shortage of data for these native South African languages, the datasets that were created were augmented and oversampled to increase data size and overcome class classification imbalance.

Classification regression +2

MphayaNER: Named Entity Recognition for Tshivenda

1 code implementation 8 Apr 2023 Rendani Mbuvha, David I. Adelani, Tendani Mutavhatsindi, Tshimangadzo Rakhuhu, Aluwani Mauda, Tshifhiwa Joshua Maumela, Andisani Masindi, Seani Rananga, Vukosi Marivate, Tshilidzi Marwala

Named Entity Recognition (NER) plays a vital role in various Natural Language Processing tasks such as information retrieval, text classification, and question answering.

Information Retrieval named-entity-recognition +6

Conversational Pattern Mining using Motif Detection

no code implementations 13 Nov 2022 Nicolle Garber, Vukosi Marivate

The subject of conversational mining has become of great interest recently due to the explosion of social and other online media.

Reinforcement Learning in Education: A Multi-Armed Bandit Approach

no code implementations 1 Nov 2022 Herkulaas Combrink, Vukosi Marivate, Benjamin Rosman

Advances in reinforcement learning research have demonstrated the ways in which different agent-based models can learn how to optimally perform a task within a given environment.

reinforcement-learning Reinforcement Learning (RL)

A Framework for Undergraduate Data Collection Strategies for Student Support Recommendation Systems in Higher Education

no code implementations 16 Oct 2022 Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman

While much effort and detail has gone into the expansion of explaining algorithmic decision making in this context, there is still a need to develop data collection strategies. Therefore, the purpose of this paper is to outline a data collection framework specific to recommender systems within this context in order to reduce collection biases, understand student characteristics, and find an ideal way to infer optimal influences on the student journey.

Decision Making Recommendation Systems

Semi-supervised learning approaches for predicting South African political sentiment for local government elections

no code implementations 4 May 2022 Mashadi Ledwaba, Vukosi Marivate

This study aims to understand the South African political context by analysing the sentiments shared on Twitter during the local government elections.

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations 6 Dec 2021 Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation

Training Cross-Lingual embeddings for Setswana and Sepedi

1 code implementation 11 Nov 2021 Mack Makgatho, Vukosi Marivate, Tshephisho Sefara, Valencia Wagner

This paper trains Setswana and Sepedi monolingual word vectors and uses VecMap to create cross-lingual embeddings for Setswana-Sepedi in order to do a cross-lingual transfer.

Cross-Lingual Transfer Semantic Similarity +2

An empirical investigation into audio pipeline approaches for classifying bird species

no code implementations 10 Aug 2021 David Behr, Ciira wa Maina, Vukosi Marivate

This paper is an investigation into aspects of an audio classification pipeline that will be appropriate for the monitoring of bird species on edge devices.

Audio Classification Data Augmentation +2

Towards Financial Sentiment Analysis in a South African Landscape

no code implementations 18 Jun 2021 Michelle Terblanche, Vukosi Marivate

Sentiment analysis as a sub-field of natural language processing has received increased attention in the past decade enabling organisations to more effectively manage their reputation through online media monitoring.

Sentiment Analysis

1st AfricaNLP Workshop Proceedings, 2020

no code implementations 20 Nov 2020 Kathleen Siminyu, Laura Martinus, Vukosi Marivate

Proceedings of the 1st AfricaNLP Workshop, held on 26th April alongside ICLR 2020, Virtual Conference (formerly Addis Ababa, Ethiopia).

AI4D -- African Language Dataset Challenge

no code implementations 23 Jul 2020 Kathleen Siminyu, Sackey Freshia, Jade Abbott, Vukosi Marivate

This work details the organisation of the AI4D - African Language Dataset Challenge, an effort to incentivize the creation, organization and discovery of African language datasets through a competitive challenge.

BIG-bench Machine Learning

Mapping the South African health landscape in response to COVID-19

1 code implementation 26 Jun 2020 Nompumelelo Mtsweni, Herkulaas MvE Combrink, Vukosi Marivate

When the COVID-19 disease pandemic infiltrated the world, there was an immediate need for accurate information.

Computers and Society

Investigating similarities and differences between South African and Sierra Leonean school outcomes using Machine Learning

no code implementations 22 Apr 2020 Henry Wandera, Vukosi Marivate, David Sengeh

Available or adequate information to inform decision making for resource allocation in support of school improvement is a critical issue globally.

BIG-bench Machine Learning Decision Making

A Framework For Sharing Publicly Available Data To Inform The COVID-19 Outbreak in Africa: A South African Case Study

1 code implementation 2 Apr 2020 Vukosi Marivate, Herkulaas MvE Combrink

These announcements narrate the confirmed COVID-19 cases and include the age, gender, and travel history of people who have tested positive for the disease.

Computers and Society Applications
