no code implementations • SMM4H (COLING) 2022 • Atnafu Lambebo Tonja, Olumide Ebenezer Ojo, Mohammed Arif Khan, Abdul Gafar Manuel Meque, Olga Kolesnikova, Grigori Sidorov, Alexander Gelbukh
This paper describes our submissions for the Social Media Mining for Health (SMM4H) 2022 shared tasks.
no code implementations • 1 Apr 2025 • Girma Yohannis Bade, Zahra Ahani, Olga Kolesnikova, José Luis Oropeza, Grigori Sidorov
The increasing misuse of social media has become a concern; however, technological solutions are being developed to moderate its content effectively.
no code implementations • 30 Mar 2025 • Mikhail Krasitskii, Olga Kolesnikova, Liliana Chanona Hernandez, Grigori Sidorov, Alexander Gelbukh
The sentiment analysis task in Tamil-English code-mixed texts has been explored using advanced transformer-based models.
no code implementations • 24 Mar 2025 • Tadesse Destaw Belay, Dawit Ketema Gete, Abinew Ali Ayele, Olga Kolesnikova, Grigori Sidorov, Seid Muhie Yimam
As users express different emotions simultaneously in a single instance, annotating emotions in a multilabel setting such as the EthioEmo (Belay et al., 2025) dataset effectively captures this dynamic.
no code implementations • 12 Mar 2025 • Tadesse Destaw Belay, Ahmed Haj Ahmed, Alvin Grissom II, Iqra Ameer, Grigori Sidorov, Olga Kolesnikova, Seid Muhie Yimam
We use this benchmark to evaluate several state-of-the-art LLMs on culture-aware emotion prediction and sentiment analysis tasks.
no code implementations • 21 Jan 2025 • Mikhail Krasitskii, Olga Kolesnikova, Liliana Chanona Hernandez, Grigori Sidorov, Alexander Gelbukh
This study explores transformer-based models such as BERT, mBERT, and XLM-R for multi-lingual sentiment analysis across diverse linguistic structures.
no code implementations • 6 Jan 2025 • Olga Kolesnikova, Moein Shahiki Tash, Zahra Ahani, Ameeta Agrawal, Raul Monroy, Grigori Sidorov
Additionally, we achieved a 0.4% increase in the macro F1 score for the second task and a 0.7% increase for the third task, compared to previous work utilizing traditional machine learning with psycholinguistic and unigram-based TF-IDF values.
no code implementations • 17 Dec 2024 • Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Grigori Sidorov, Dietrich Klakow, Philipp Slusallek, Olga Kolesnikova, Seid Muhie Yimam
The results show that accurate multi-label emotion classification is still insufficient even for high-resource languages such as English, and there is a large gap between the performance of high-resource and low-resource languages.
no code implementations • 3 Oct 2024 • Mesay Gemeda Yigezu, Melkamu Abay Mersha, Girma Yohannis Bade, Jugal Kalita, Olga Kolesnikova, Alexander Gelbukh
The experimental results show that the ensemble learning approach performs best, achieving an F1 score of 0.99.
no code implementations • 4 Sep 2024 • Moein Shahiki Tash, Zahra Ahani, Mohim Tash, Olga Kolesnikova, Grigori Sidorov
This study analyzes predictive statements, hope speech, and regret detection behaviors within cryptocurrency-related discussions, leveraging advanced natural language processing techniques.
no code implementations • 6 May 2024 • Moein Shahiki Tash, Zahra Ahani, Olga Kolesnikova, Grigori Sidorov
This study delves into the relationship between emotional trends from X platform data and the market dynamics of well-known cryptocurrencies Cardano, Binance, Fantom, Matic, and Ripple over the period from October 2022 to March 2023.
no code implementations • 8 Apr 2024 • Atnafu Lambebo Tonja, Fazlourrahman Balouchzahi, Sabur Butt, Olga Kolesnikova, Hector Ceballos, Alexander Gelbukh, Thamar Solorio
The paper focuses on the marginalization of indigenous language communities in the face of rapid technological advancements.
no code implementations • 28 Mar 2024 • Atnafu Lambebo Tonja, Olga Kolesnikova, Alexander Gelbukh, Jugal Kalita
Recent research in natural language processing (NLP) has achieved impressive performance in tasks such as machine translation (MT), news classification, and question-answering in high-resource languages.
no code implementations • 20 Mar 2024 • Atnafu Lambebo Tonja, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, Moges Ahmed Mehamed, Abinew Ali Ayele, Ebrahim Chekol Jibril, Michael Melese Woldeyohannis, Olga Kolesnikova, Philipp Slusallek, Dietrich Klakow, Shengwu Xiong, Seid Muhie Yimam
We open-source our multilingual language models, new benchmark datasets for various downstream tasks, and task-specific fine-tuned language models and discuss the performance of the models.
no code implementations • 8 Dec 2023 • Atnafu Lambebo Tonja, Melkamu Mersha, Ananya Kalita, Olga Kolesnikova, Jugal Kalita
This paper presents the creation of initial bilingual corpora for thirteen very low-resource languages of India, all from Northeast India.
no code implementations • 7 Nov 2023 • Yevhen Kostiuk, Grigori Sidorov, Olga Kolesnikova
In this paper, we present a dataset of the most frequent Spanish verb-noun collocations and the sentences where they occur. Each collocation is assigned to one of 37 lexical functions defined as classes for a hierarchical classification task.
no code implementations • 2 Jun 2023 • Yevhen Kostiuk, Atnafu Lambebo Tonja, Grigori Sidorov, Olga Kolesnikova
In this paper, we investigate the issue of hate speech by presenting a novel task of translating hate speech into non-hate speech text while preserving its meaning.
no code implementations • 27 May 2023 • Atnafu Lambebo Tonja, Hellina Hailu Nigatu, Olga Kolesnikova, Grigori Sidorov, Alexander Gelbukh, Jugal Kalita
This paper describes CIC NLP's submission to the AmericasNLP 2023 Shared Task on machine translation systems for indigenous languages of the Americas.
no code implementations • 27 May 2023 • Atnafu Lambebo Tonja, Christian Maldonado-Sifuentes, David Alejandro Mendoza Castillo, Olga Kolesnikova, Noé Castro-Sánchez, Grigori Sidorov, Alexander Gelbukh
In this paper, we present a parallel Spanish-Mazatec and Spanish-Mixtec corpus for machine translation (MT) tasks, where Mazatec and Mixtec are two indigenous Mexican languages.
1 code implementation • 25 Mar 2023 • Atnafu Lambebo Tonja, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, Seid Muhie Yimam
This survey delves into the current state of natural language processing (NLP) for four Ethiopian languages: Amharic, Afaan Oromo, Tigrinya, and Wolaytta.
no code implementations • 26 Nov 2022 • Atnafu Lambebo Tonja, Mesay Gemeda Yigezu, Olga Kolesnikova, Moein Shahiki Tash, Grigori Sidorov, Alexander Gelbukh
The use of code-mixed data in natural language processing (NLP) research is currently receiving considerable attention.
2 code implementations • 27 Oct 2022 • Tadesse Destaw Belay, Atnafu Lambebo Tonja, Olga Kolesnikova, Seid Muhie Yimam, Abinew Ali Ayele, Silesh Bogale Haile, Grigori Sidorov, Alexander Gelbukh
Machine translation (MT) is one of the main tasks in natural language processing whose objective is to translate texts automatically from one natural language to another.