no code implementations • NAACL (BEA) 2022 • Kai North, Marcos Zampieri, Matthew Shardlow
Identifying complex words in texts is an important first step in text simplification (TS) systems.
no code implementations • 22 Feb 2024 • Kai North, Tharindu Ranasinghe, Matthew Shardlow, Marcos Zampieri
We present MultiLS, the first LS framework that allows for the creation of a multi-task LS dataset.
no code implementations • 26 Jan 2024 • Md Mushfiqur Rahman, Mohammad Sabik Irbaz, Kai North, Michelle S. Williams, Marcos Zampieri, Kevin Lybarger
Our innovative RLHF reward function surpassed existing RL text simplification reward functions in effectiveness.
no code implementations • 25 Nov 2023 • Md Nishat Raihan, Umma Hani Tanmoy, Anika Binte Islam, Kai North, Tharindu Ranasinghe, Antonios Anastasopoulos, Marcos Zampieri
Identifying offensive content in social media is vital for creating safe online communities.
no code implementations • 31 May 2023 • Noëmi Aepli, Çağrı Çöltekin, Rob van der Goot, Tommi Jauhiainen, Mourhaf Kazzaz, Nikola Ljubešić, Kai North, Barbara Plank, Yves Scherrer, Marcos Zampieri
This report presents the results of the shared tasks organized as part of the VarDial Evaluation Campaign 2023.
no code implementations • 19 May 2023 • Kai North, Tharindu Ranasinghe, Matthew Shardlow, Marcos Zampieri
To reflect these recent advances, we present a comprehensive survey of papers published between 2017 and 2023 on LS and its sub-tasks with a special focus on deep learning.
no code implementations • 8 Mar 2023 • Kai North, Marcos Zampieri, Matthew Shardlow
Finally, we include brief sections on applications of lexical complexity prediction, such as readability and text simplification, together with related studies on languages other than English.
1 code implementation • 2 Mar 2023 • Marcos Zampieri, Kai North, Tommi Jauhiainen, Mariano Felice, Neha Kumari, Nishant Nair, Yash Bangera
Research has shown that this is a problematic assumption, particularly in the case of very similar languages (e.g., Croatian and Serbian) and national language varieties (e.g., Brazilian and European Portuguese), where texts may contain no distinctive marker of the particular language or variety.
no code implementations • 6 Feb 2023 • Horacio Saggion, Sanja Štajner, Daniel Ferrés, Kim Cheng SHEANG, Matthew Shardlow, Kai North, Marcos Zampieri
We report findings of the TSAR-2022 shared task on multilingual lexical simplification, organized as part of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022), held in conjunction with EMNLP 2022.
no code implementations • 18 Nov 2022 • Tharindu Ranasinghe, Kai North, Damith Premasiri, Marcos Zampieri
The spread of offensive content online has become a reason for great concern in recent years, motivating researchers to develop robust systems capable of identifying such content automatically.
no code implementations • COLING 2022 • Kai North, Marcos Zampieri, Tharindu Ranasinghe
To continue improving the performance of LS systems we introduce ALEXSIS-PT, a novel multi-candidate dataset for Brazilian Portuguese LS containing 9,605 candidate substitutions for 387 complex words.
2 code implementations • 12 Sep 2022 • Sanja Stajner, Daniel Ferres, Matthew Shardlow, Kai North, Marcos Zampieri, Horacio Saggion
To showcase the usability of the dataset, we adapt two state-of-the-art lexical simplification systems with differing architectures (neural vs. non-neural) to all three languages (English, Spanish, and Brazilian Portuguese) and evaluate their performances on our new dataset.
no code implementations • SEMEVAL 2021 • Abhinandan Desai, Kai North, Marcos Zampieri, Christopher M. Homan
This paper describes team LCP-RIT's submission to the SemEval-2021 Task 1: Lexical Complexity Prediction (LCP).