no code implementations • WS 2018 • Seid Muhie Yimam, Chris Biemann, Shervin Malmasi, Gustavo H. Paetzold, Lucia Specia, Sanja Štajner, Anaïs Tack, Marcos Zampieri
We report the findings of the second Complex Word Identification (CWI) shared task organized as part of the BEA workshop co-located with NAACL-HLT'2018.
no code implementations • COLING 2018 • Seid Muhie Yimam, Chris Biemann
Learning from a real-world data stream and continuously updating the model without explicit supervision is a new challenge for NLP applications with machine learning components.
no code implementations • 13 Jul 2018 • Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann
Investigative journalism in recent years is confronted with two major challenges: 1) vast amounts of unstructured data originating from large text collections such as leaks or answers to Freedom of Information requests, and 2) multi-lingual data due to intensified global cooperation and communication in politics, business and civil society.
no code implementations • EMNLP 2018 • Seid Muhie Yimam, Chris Biemann
In this paper, we present Par4Sem, a semantic writing aid tool based on adaptive paraphrasing.
no code implementations • EMNLP 2018 • Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann
We introduce an advanced information extraction pipeline to automatically process very large collections of unstructured textual data for the purpose of investigative journalism.
no code implementations • ACL 2016 • Seid Muhie Yimam, Heiner Ulrich, Tatiana von Landesberger, Marcel Rosenbach, Michaela Regneri, Alexander Panchenko, Franziska Lehmann, Uli Fahrer, Chris Biemann, Kathrin Ballweg
no code implementations • RANLP 2017 • Seid Muhie Yimam, Steffen Remus, Alexander Panchenko, Andreas Holzinger, Chris Biemann
In this paper, we describe the concept of entity-centric information access for the biomedical domain.
no code implementations • WS 2016 • Richard Eckart de Castilho, Éva Mújdricza-Maydt, Seid Muhie Yimam, Silvana Hartmann, Iryna Gurevych, Anette Frank, Chris Biemann
We introduce the third major release of WebAnno, a generic web-based annotation tool for distributed teams.
no code implementations • IJCNLP 2017 • Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
Complex word identification (CWI) is an important task in text accessibility.
no code implementations • RANLP 2017 • Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility.
no code implementations • 9 Dec 2019 • Seid Muhie Yimam, Abinew Ali Ayele, Chris Biemann
Since several languages can be written using the Fidel script, we have used the existing Amharic, Tigrinya and Ge'ez corpora to retain only the Amharic tweets.
no code implementations • SEMEVAL 2020 • Gregor Wiedemann, Seid Muhie Yimam, Chris Biemann
Fine-tuning of pre-trained transformer networks such as BERT yields state-of-the-art results for text classification tasks.
no code implementations • NAACL 2021 • Sian Gooding, Ekaterina Kochmar, Seid Muhie Yimam, Chris Biemann
Lexical complexity is a highly subjective notion, yet this factor is often neglected in lexical simplification and readability systems, which use a "one-size-fits-all" approach.
no code implementations • NAACL 2021 • Max Wiechmann, Seid Muhie Yimam, Chris Biemann
ActiveAnno is built with extensible design and easy deployment in mind, all to enable users to perform annotation tasks with high efficiency and high-quality annotation results.
no code implementations • COLING 2020 • Seid Muhie Yimam, Hizkiel Mitiku Alemayehu, Abinew Ayele, Chris Biemann
To advance sentiment analysis research in Amharic and other related low-resource languages, we release the dataset, the annotation tool, source code, and models publicly under a permissive license.
no code implementations • EACL 2021 • Christian Haase, Saba Anwar, Seid Muhie Yimam, Alexander Friedrich, Chris Biemann
There are two main approaches to the exploration of dynamic networks: the discrete one compares a series of clustered graphs from separate points in time.
no code implementations • LREC 2022 • Meriem Beloucif, Seid Muhie Yimam, Steffen Stahlhacke, Chris Biemann
Comparative Question Answering (cQA) is the task of providing concrete and accurate responses to queries such as: “Is Lyft cheaper than a regular taxi?” or “What makes a mortgage different from a regular loan?”.
no code implementations • 25 Jan 2023 • Debayan Banerjee, Seid Muhie Yimam, Sushil Awale, Chris Biemann
In this work, we present ARDIAS, a web-based application that aims to provide researchers with a full suite of discovery and collaboration tools.
no code implementations • 12 Feb 2024 • Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Mitiku Yohannes Fuge, Aman Kassahun Wassie, Eyasu Shiferaw Jada, Yonas Chanie, Walelign Tewabe Sewunetie, Seid Muhie Yimam
We compile an Amharic instruction fine-tuning dataset and fine-tune the LLaMA-2-Amharic model.
no code implementations • 20 Mar 2024 • Atnafu Lambebo Tonja, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, Moges Ahmed Mehamed, Abinew Ali Ayele, Ebrahim Chekol Jibril, Michael Melese Woldeyohannis, Olga Kolesnikova, Philipp Slusallek, Dietrich Klakow, Shengwu Xiong, Seid Muhie Yimam
We open-source our multilingual language models, new benchmark datasets for various downstream tasks, and task-specific fine-tuned language models and discuss the performance of the models.
1 code implementation • SEMEVAL 2017 • Titas Nandi, Chris Biemann, Seid Muhie Yimam, Deepak Gupta, Sarah Kohail, Asif Ekbal, Pushpak Bhattacharyya
In this paper we present the system for Answer Selection and Ranking in Community Question Answering, which we build as part of our participation in SemEval-2017 Task 3.
1 code implementation • 25 Mar 2023 • Atnafu Lambebo Tonja, Tadesse Destaw Belay, Israel Abebe Azime, Abinew Ali Ayele, Moges Ahmed Mehamed, Olga Kolesnikova, Seid Muhie Yimam
This survey delves into the current state of natural language processing (NLP) for four Ethiopian languages: Amharic, Afaan Oromo, Tigrinya, and Wolaytta.
1 code implementation • 18 Apr 2024 • Abinew Ali Ayele, Esubalew Alemneh Jalew, Adem Chanie Ali, Seid Muhie Yimam, Chris Biemann
The prevalence of digital media and evolving sociopolitical dynamics have significantly amplified the dissemination of hateful content.
1 code implementation • KONVENS (WS) 2021 • Niklas von Boguszewski, Sana Moin, Anirban Bhowmick, Seid Muhie Yimam, Chris Biemann
Hence, we show that transfer learning from the social media domain is efficacious in classifying hate and offensive speech in movies through subtitles.
1 code implementation • 2 Nov 2020 • Seid Muhie Yimam, Abinew Ali Ayele, Gopalakrishnan Venkatesh, Chris Biemann
We find that newly trained models perform better than pre-trained multilingual models.
1 code implementation • SIGUL (LREC) 2022 • Tadesse Destaw, Seid Muhie Yimam, Abinew Ayele, Chris Biemann
Questions are posted in Amharic, in English, or in Amharic written in Latin script.
2 code implementations • 27 Oct 2022 • Tadesse Destaw Belay, Atnafu Lambebo Tonja, Olga Kolesnikova, Seid Muhie Yimam, Abinew Ali Ayele, Silesh Bogale Haile, Grigori Sidorov, Alexander Gelbukh
Machine translation (MT) is one of the main tasks in natural language processing whose objective is to translate texts automatically from one natural language to another.
1 code implementation • LREC 2020 • Seid Muhie Yimam, Gopalakrishnan Venkatesh, John Sie Yuen Lee, Chris Biemann
The aim is to build a writing aid system that automatically edits a text so that it better adheres to the academic style of writing.
2 code implementations • 13 Feb 2024 • Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine de Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Winata, Seid Muhie Yimam, Saif M. Mohammad
Exploring and quantifying semantic relatedness is central to representing language.
1 code implementation • 27 Mar 2024 • Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Meriem Beloucif, Christine de Kock, Oumaima Hourrane, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Krishnapriya Vishnubhotla, Seid Muhie Yimam, Saif M. Mohammad
We present the first shared task on Semantic Textual Relatedness (STR).
1 code implementation • 13 Apr 2023 • Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Seid Muhie Yimam, David Ifeoluwa Adelani, Ibrahim Sa'id Ahmad, Nedjma Ousidhoum, Abinew Ayele, Saif M. Mohammad, Meriem Beloucif, Sebastian Ruder
We present the first Africentric SemEval Shared task, Sentiment Analysis for African Languages (AfriSenti-SemEval). The dataset is available at https://github.com/afrisenti-semeval/afrisent-semeval-2023.
2 code implementations • 22 Mar 2021 • David Ifeoluwa Adelani, Jade Abbott, Graham Neubig, Daniel D'souza, Julia Kreutzer, Constantine Lignos, Chester Palen-Michel, Happy Buzaaba, Shruti Rijhwani, Sebastian Ruder, Stephen Mayhew, Israel Abebe Azime, Shamsuddeen Muhammad, Chris Chinenye Emezue, Joyce Nakatumba-Nabende, Perez Ogayo, Anuoluwapo Aremu, Catherine Gitau, Derguene Mbaye, Jesujoba Alabi, Seid Muhie Yimam, Tajuddeen Gwadabe, Ignatius Ezeani, Rubungo Andre Niyongabo, Jonathan Mukiibi, Verrah Otiende, Iroro Orife, Davis David, Samba Ngom, Tosin Adewumi, Paul Rayson, Mofetoluwa Adeyemi, Gerald Muriuki, Emmanuel Anebi, Chiamaka Chukwuneke, Nkiruka Odu, Eric Peter Wairagala, Samuel Oyerinde, Clemencia Siro, Tobius Saul Bateesa, Temilola Oloyede, Yvonne Wambui, Victor Akinode, Deborah Nabagereka, Maurice Katusiime, Ayodele Awokoya, Mouhamadane MBOUP, Dibora Gebreyohannes, Henok Tilaye, Kelechi Nwaike, Degaga Wolde, Abdoulaye Faye, Blessing Sibanda, Orevaoghene Ahia, Bonaventure F. P. Dossou, Kelechi Ogueji, Thierno Ibrahima DIOP, Abdoulaye Diallo, Adewale Akinfaderin, Tendai Marengereke, Salomey Osei
We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders.
6 code implementations • 18 Dec 2020 • Binny Mathew, Punyajoy Saha, Seid Muhie Yimam, Chris Biemann, Pawan Goyal, Animesh Mukherjee
We also observe that models, which utilize the human rationales for training, perform better in reducing unintended bias towards target communities.
Ranked #3 on Hate Speech Detection on HateXplain
3 code implementations • 17 Feb 2023 • Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Sa'id Ahmad, Meriem Beloucif, Saif M. Mohammad, Sebastian Ruder, Oumaima Hourrane, Pavel Brazdil, Felermino Dário Mário António Ali, Davis David, Salomey Osei, Bello Shehu Bello, Falalu Ibrahim, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Steven Arthur
These include 75 languages with at least one million speakers each.