1 code implementation • COLING (MWE) 2020 • Beatriz Fisas, Luis Espinosa Anke, Joan Codina-Filbá, Leo Wanner
Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound words traditionally pose a challenge to language learners and many Natural Language Processing (NLP) applications alike.
1 code implementation • MSR (COLING) 2020 • Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner
As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.
1 code implementation • ACL (WOAH) 2021 • Alexander Shvets, Paula Fortuna, Juan Soler, Leo Wanner
Mainstream research on hate speech focused so far predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse-grained hate speech categories.
no code implementations • ACL (WebNLG, INLG) 2020 • Simon Mille, Spyridon Symeonidis, Maria Rousi, Montserrat Marimon Felipe, Klearchos Stavrothanasopoulos, Petros Alvanitopoulos, Roberto Carlini Salguero, Jens Grivolla, Georgios Meditskos, Stefanos Vrochidis, Leo Wanner
In this paper, we present a pipeline system that generates architectural landmark descriptions using textual, visual and structured data.
no code implementations • ACL (NLP4PosImpact) 2021 • Paula Fortuna, Laura Pérez-Mayos, Ahmed Abura’Ed, Juan Soler-Company, Leo Wanner
Based on a list of keywords retrieved from the literature and revised in view of the task, we select from this corpus articles that can be considered to be on NLP4SG according to our definition and analyze them in terms of trends along the time line, etc.
1 code implementation • 14 Oct 2024 • Yiping Jin, Leo Wanner, Aneesh Moideen Koya
Experiments on popular industrial and academic models demonstrate that HS detectors assign a higher hatefulness score merely based on the mention of specific target identities.
1 code implementation • 23 Feb 2024 • Yiping Jin, Leo Wanner, Alexander Shvets
A recent proposal in this direction is HateCheck, a suite for testing fine-grained model functionalities on synthesized data generated using templates of the kind "You are just a [slur] to me."
no code implementations • 22 Aug 2023 • Despoina Chatzakou, Juan Soler-Company, Theodora Tsikrika, Leo Wanner, Stefanos Vrochidis, Ioannis Kompatsiaris
Social media users often hold several accounts in their effort to multiply the spread of their thoughts, ideas, and viewpoints.
no code implementations • 4 May 2023 • Yiping Jin, Leo Wanner, Vishakha Laxman Kadam, Alexander Shvets
As pointed out by several scholars, current research on hate speech (HS) recognition is characterized by unsystematic data creation strategies and diverging annotation schemata.
no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.
no code implementations • *SEM (NAACL) 2022 • Luis Espinosa-Anke, Alexander Shvets, Alireza Mohammadshahi, James Henderson, Leo Wanner
Recognizing and categorizing lexical collocations in context is useful for language learning, dictionary compilation and downstream NLP.
no code implementations • EMNLP 2021 • Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner
This calls for a study of the impact of pretraining data size on the knowledge of the models.
no code implementations • Findings (ACL) 2021 • Laura Pérez-Mayos, Alba Táboas García, Simon Mille, Leo Wanner
More specifically, we evaluate the syntactic generalization potential of the models on English and Spanish tests, comparing the syntactic abilities of monolingual and multilingual models on the same language (English), and of multilingual models on two different languages (English and Spanish).
1 code implementation • EACL 2021 • Luis Espinosa Anke, Joan Codina-Filba, Leo Wanner
We first construct a dataset of apparitions of lexical collocations in context, categorized into 17 representative semantic categories.
no code implementations • EACL 2021 • Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner
The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations.
1 code implementation • 25 Aug 2020 • Alexander Shvets, Leo Wanner
Concept extraction is crucial for a number of downstream applications.
no code implementations • LREC 2020 • Paula Fortuna, Juan Soler, Leo Wanner
The field of the automatic detection of hate speech and related concepts has raised a lot of interest in the last years.
no code implementations • LREC 2020 • Monica Dominguez, Juan Soler, Leo Wanner
This paper introduces ThemePro, a toolkit for the automatic analysis of thematic progression.
no code implementations • WS 2019 • Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner
We report results from the SR’19 Shared Task, the second edition of a multilingual surface realisation task organised as part of the EMNLP’19 Workshop on Multilingual Surface Realisation.
no code implementations • WS 2019 • Simon Mille, Stamatia Dasiopoulou, Beatriz Fisas, Leo Wanner
Statistical generators increasingly dominate the research in NLG.
1 code implementation • WS 2019 • Paula Fortuna, João Rocha da Silva, Juan Soler-Company, Leo Wanner, Sérgio Nunes
Firstly, non-experts annotated the tweets with binary labels (‘hate’ vs. ‘no-hate’).
1 code implementation • ACL 2019 • Luis Espinosa Anke, Steven Schockaert, Leo Wanner
Lexical relation classification is the task of predicting whether a certain relation holds between a given pair of words.
no code implementations • WS 2018 • Alexander Shvets, Simon Mille, Leo Wanner
An increasing amount of research tackles the challenge of text generation from abstract ontological or semantic structures, which are in their very nature potentially large connected graphs.
no code implementations • WS 2018 • Simon Mille, Anja Belz, Bernd Bohnet, Leo Wanner
In this paper, we present the datasets used in the Shallow and Deep Tracks of the First Multilingual Surface Realisation Shared Task (SR’18).
no code implementations • WS 2018 • Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner
We report results from the SR’18 Shared Task, a new multilingual surface realisation task organised as part of the ACL’18 Workshop on Multilingual Surface Realisation.
no code implementations • WS 2017 • Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz
We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages.
no code implementations • WS 2017 • Simon Mille, Leo Wanner
This demo paper presents the multilingual deep sentence generator developed by the TALN group at Universitat Pompeu Fabra, implemented as a series of rule-based graph-transducers for the syntacticization of the input graphs, the resolution of morphological agreements, and the linearization of the trees.
no code implementations • SEMEVAL 2017 • Simon Mille, Roberto Carlini, Alicia Burga, Leo Wanner
We present the contribution of Universitat Pompeu Fabra’s NLP group to the SemEval Task 9.2 (AMR-to-English Generation).
no code implementations • WS 2017 • Alp Öktem, Mireia Farrús, Leo Wanner
This paper presents a methodology to extract parallel speech corpora based on any language pair from dubbed movies, together with an application framework in which some corresponding prosodic parameters are extracted.
no code implementations • EACL 2017 • Juan Soler-Company, Leo Wanner
The majority of approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text.
1 code implementation • COLING 2016 • Mónica Domínguez, Mireia Farrús, Leo Wanner
Speech prosody is known to be central in advanced communication technologies.
no code implementations • COLING 2016 • Luis Espinosa-Anke, Jose Camacho-Collados, Sara Rodríguez-Fernández, Horacio Saggion, Leo Wanner
WordNet is probably the best known lexical resource in Natural Language Processing.
no code implementations • COLING 2016 • Mónica Domínguez, Iván Latorre, Mireia Farrús, Joan Codina-Filbà, Leo Wanner
This paper presents an implementation of the widely used speech analysis tool Praat as a web application with an extended functionality for feature annotation.
no code implementations • LREC 2016 • Alicia Burga, Sergio Cajal, Joan Codina-Filbà, Leo Wanner
Despite the popularity of coreference resolution as a research topic, the overwhelming majority of the work in this area focused so far on single antecedence coreference only.
no code implementations • LREC 2016 • Sara Rodríguez-Fernández, Roberto Carlini, Luis Espinosa Anke, Leo Wanner
Collocations such as “heavy rain” or “make [a] decision” are combinations of two elements where one (the base) is freely chosen, while the choice of the other (collocate) is restricted, depending on the base.
no code implementations • LREC 2016 • Juan Soler, Leo Wanner
In most of the research studies on Author Profiling, large quantities of correctly labeled data are used to train the models.
no code implementations • LREC 2014 • Juan Soler Company, Leo Wanner
Over the last years, author profiling in general and author gender identification in particular have become a popular research area due to their potential attractive applications that range from forensic investigations to online marketing studies.
no code implementations • LREC 2014 • Nadjet Bouayad-Agha, Alicia Burga, Gerard Casamayor, Joan Codina, Rogelio Nazar, Leo Wanner
The Stanford Coreference Resolution System (StCR) is a multi-pass, rule-based system that scored best in the CoNLL 2011 shared task on general discourse coreference resolution.