Search Results for author: Leo Wanner

Found 57 papers, 9 papers with code

CollFrEn: Rich Bilingual English–French Collocation Resource

1 code implementation • COLING (MWE) 2020 • Beatriz Fisas, Luis Espinosa Anke, Joan Codina-Filbá, Leo Wanner

Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound words traditionally pose a challenge to language learners and many Natural Language Processing (NLP) applications alike.

Machine Translation Relation Classification +3

Cartography of Natural Language Processing for Social Good (NLP4SG): Searching for Definitions, Statistics and White Spots

no code implementations • ACL (NLP4PosImpact) 2021 • Paula Fortuna, Laura Pérez-Mayos, Ahmed Abura’Ed, Juan Soler-Company, Leo Wanner

Based on a list of keywords retrieved from the literature and revised in view of the task, we select from this corpus articles that can be considered to be on NLP4SG according to our definition and analyze them in terms of trends along the time line, etc.

Ethics Text Simplification

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results

1 code implementation • MSR (COLING) 2020 • Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner

As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.

Targets and Aspects in Social Media Hate Speech

1 code implementation • ACL (WOAH) 2021 • Alexander Shvets, Paula Fortuna, Juan Soler, Leo Wanner

Mainstream research on hate speech focused so far predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse-grained hate speech categories.

Abusive Language

A Case Study of NLG from Multimedia Data Sources: Generating Architectural Landmark Descriptions

no code implementations • ACL (WebNLG, INLG) 2020 • Simon Mille, Spyridon Symeonidis, Maria Rousi, Montserrat Marimon Felipe, Klearchos Stavrothanasopoulos, Petros Alvanitopoulos, Roberto Carlini Salguero, Jens Grivolla, Georgios Meditskos, Stefanos Vrochidis, Leo Wanner

In this paper, we present a pipeline system that generates architectural landmark descriptions using textual, visual and structured data.

Retrieval

GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?

1 code implementation • 23 Feb 2024 • Yiping Jin, Leo Wanner, Alexander Shvets

A recent proposal in this direction is HateCheck, a suite for testing fine-grained model functionalities on synthesized data generated using templates of the kind "You are just a [slur] to me."

Hate Speech Detection Natural Language Inference +1

User Identity Linkage in Social Media Using Linguistic and Social Interaction Features

no code implementations • 22 Aug 2023 • Despoina Chatzakou, Juan Soler-Company, Theodora Tsikrika, Leo Wanner, Stefanos Vrochidis, Ioannis Kompatsiaris

Social media users often hold several accounts in their effort to multiply the spread of their thoughts, ideas, and viewpoints.

Towards Weakly-Supervised Hate Speech Classification Across Datasets

no code implementations • 4 May 2023 • Yiping Jin, Leo Wanner, Vishakha Laxman Kadam, Alexander Shvets

As pointed out by several scholars, current research on hate speech (HS) recognition is characterized by unsystematic data creation strategies and diverging annotation schemata.

Text Classification

Missing Information, Unresponsive Authors, Experimental Flaws: The Impossibility of Assessing the Reproducibility of Previous Human Evaluations in NLP

no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang

We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.

Multilingual Extraction and Categorization of Lexical Collocations with Graph-aware Transformers

no code implementations • *SEM (NAACL) 2022 • Luis Espinosa-Anke, Alexander Shvets, Alireza Mohammadshahi, James Henderson, Leo Wanner

Recognizing and categorizing lexical collocations in context is useful for language learning, dictionary compilation and downstream NLP.

How much pretraining data do language models need to learn syntax?

no code implementations • EMNLP 2021 • Laura Pérez-Mayos, Miguel Ballesteros, Leo Wanner

This calls for a study of the impact of pretraining data size on the knowledge of the models.

Dependency Parsing Paraphrase Identification +1

Assessing the Syntactic Capabilities of Transformer-based Multilingual Language Models

no code implementations • Findings (ACL) 2021 • Laura Pérez-Mayos, Alba Táboas García, Simon Mille, Leo Wanner

More specifically, we evaluate the syntactic generalization potential of the models on English and Spanish tests, comparing the syntactic abilities of monolingual and multilingual models on the same language (English), and of multilingual models on two different languages (English and Spanish).

Cross-Lingual Transfer

Evaluating language models for the retrieval and categorization of lexical collocations

1 code implementation • EACL 2021 • Luis Espinosa Anke, Joan Codina-Filba, Leo Wanner

We first construct a dataset of apparitions of lexical collocations in context, categorized into 17 representative semantic categories.

Retrieval valid

On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

no code implementations • EACL 2021 • Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations.

Constituency Parsing POS +2

Concept Extraction Using Pointer-Generator Networks

1 code implementation • 25 Aug 2020 • Alexander Shvets, Leo Wanner

Concept extraction is crucial for a number of downstream applications.

Concept Alignment

ThemePro: A Toolkit for the Analysis of Thematic Progression

no code implementations • LREC 2020 • Monica Dominguez, Juan Soler, Leo Wanner

This paper introduces ThemePro, a toolkit for the automatic analysis of thematic progression.

Text Generation

Toxic, Hateful, Offensive or Abusive? What Are We Really Classifying? An Empirical Analysis of Hate Speech Datasets

no code implementations • LREC 2020 • Paula Fortuna, Juan Soler, Leo Wanner

The field of the automatic detection of hate speech and related concepts has raised a lot of interest in the last years.

The Second Multilingual Surface Realisation Shared Task (SR'19): Overview and Evaluation Results

no code implementations • WS 2019 • Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner

We report results from the SR'19 Shared Task, the second edition of a multilingual surface realisation task organised as part of the EMNLP'19 Workshop on Multilingual Surface Realisation.

Teaching FORGe to Verbalize DBpedia Properties in Spanish

no code implementations • WS 2019 • Simon Mille, Stamatia Dasiopoulou, Beatriz Fisas, Leo Wanner

Statistical generators increasingly dominate the research in NLG.

A Hierarchically-Labeled Portuguese Hate Speech Dataset

1 code implementation • WS 2019 • Paula Fortuna, João Rocha da Silva, Juan Soler-Company, Leo Wanner, Sérgio Nunes

Firstly, non-experts annotated the tweets with binary labels ('hate' vs. 'no-hate').

Word Embeddings

Collocation Classification with Unsupervised Relation Vectors

1 code implementation • ACL 2019 • Luis Espinosa Anke, Steven Schockaert, Leo Wanner

Lexical relation classification is the task of predicting whether a certain relation holds between a given pair of words.

Classification General Classification +3

Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation

no code implementations • WS 2018 • Simon Mille, Anja Belz, Bernd Bohnet, Leo Wanner

In this paper, we present the datasets used in the Shallow and Deep Tracks of the First Multilingual Surface Realisation Shared Task (SR'18).

Natural Language Understanding Text Generation

Sentence Packaging in Text Generation from Semantic Graphs as a Community Detection Problem

no code implementations • WS 2018 • Alexander Shvets, Simon Mille, Leo Wanner

An increasing amount of research tackles the challenge of text generation from abstract ontological or semantic structures, which are in their very nature potentially large connected graphs.

Community Detection Sentence +2

The First Multilingual Surface Realisation Shared Task (SR'18): Overview and Evaluation Results

no code implementations • WS 2018 • Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner

Question Answering Text Generation

The First Multilingual Surface Realisation Shared Task (SR'18): Overview and Evaluation Results

no code implementations • WS 2018 • Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner

We report results from the SR'18 Shared Task, a new multilingual surface realisation task organised as part of the ACL'18 Workshop on Multilingual Surface Realisation.

Compilation of Corpora for the Study of the Information Structure--Prosody Interface

no code implementations • LREC 2018 • Alicia Burga, Mónica Domínguez, Mireia Farrús, Leo Wanner

Generation of a Spanish Artificial Collocation Error Corpus

no code implementations • LREC 2018 • Sara Rodríguez-Fernández, Roberto Carlini, Leo Wanner

Grammatical Error Detection

Revising the METU-Sabancı Turkish Treebank: An Exercise in Surface-Syntactic Annotation of Agglutinative Languages

no code implementations • WS 2017 • Alicia Burga, Alp Öktem, Leo Wanner

Language Modelling

Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees

no code implementations • WS 2017 • Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz

We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages.

Machine Translation POS +1

A demo of FORGe: the Pompeu Fabra Open Rule-based Generator

no code implementations • WS 2017 • Simon Mille, Leo Wanner

This demo paper presents the multilingual deep sentence generator developed by the TALN group at Universitat Pompeu Fabra, implemented as a series of rule-based graph-transducers for the syntacticization of the input graphs, the resolution of morphological agreements, and the linearization of the trees.

Sentence Text Generation

Automatic Extraction of Parallel Speech Corpora from Dubbed Movies

no code implementations • WS 2017 • Alp Öktem, Mireia Farrús, Leo Wanner

This paper presents a methodology to extract parallel speech corpora based on any language pair from dubbed movies, together with an application framework in which some corresponding prosodic parameters are extracted.

Speech-to-Speech Translation Translation

FORGe at SemEval-2017 Task 9: Deep sentence generation based on a sequence of graph transducers

no code implementations • SEMEVAL 2017 • Simon Mille, Roberto Carlini, Alicia Burga, Leo Wanner

We present the contribution of Universitat Pompeu Fabra's NLP group to the SemEval Task 9.2 (AMR-to-English Generation).

Sentence

On the Relevance of Syntactic and Discourse Features for Author Profiling and Identification

no code implementations • EACL 2017 • Juan Soler-Company, Leo Wanner

The majority of approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text.

Feature Engineering

Praat on the Web: An Upgrade of Praat for Semi-Automatic Speech Annotation

no code implementations • COLING 2016 • Mónica Domínguez, Iván Latorre, Mireia Farrús, Joan Codina-Filbà, Leo Wanner

This paper presents an implementation of the widely used speech analysis tool Praat as a web application with an extended functionality for feature annotation.

An Automatic Prosody Tagger for Spontaneous Speech

1 code implementation • COLING 2016 • Mónica Domínguez, Mireia Farrús, Leo Wanner

Speech prosody is known to be central in advanced communication technologies.

Descriptive

Extending WordNet with Fine-Grained Collocational Information via Supervised Distributional Learning

no code implementations • COLING 2016 • Luis Espinosa-Anke, Jose Camacho-Collados, Sara Rodríguez-Fernández, Horacio Saggion, Leo Wanner

WordNet is probably the best known lexical resource in Natural Language Processing.

Machine Translation Semantic Textual Similarity +5

A Neural Network Architecture for Multilingual Punctuation Generation

no code implementations • EMNLP 2016 • Miguel Ballesteros, Leo Wanner

Semantics-Driven Recognition of Collocations Using Word Embeddings

no code implementations • ACL 2016 • Sara Rodríguez-Fernández, Luis Espinosa-Anke, Roberto Carlini, Leo Wanner

Word Embeddings

Towards Multiple Antecedent Coreference Resolution in Specialized Discourse

no code implementations • LREC 2016 • Alicia Burga, Sergio Cajal, Joan Codina-Filbà, Leo Wanner

Despite the popularity of coreference resolution as a research topic, the overwhelming majority of the work in this area focused so far on single antecedence coreference only.

Abstractive Text Summarization coreference-resolution

Example-based Acquisition of Fine-grained Collocation Resources

no code implementations • LREC 2016 • Sara Rodríguez-Fernández, Roberto Carlini, Luis Espinosa Anke, Leo Wanner

Collocations such as "heavy rain" or "make [a] decision", are combinations of two elements where one (the base) is freely chosen, while the choice of the other (collocate) is restricted, depending on the base.

Word Embeddings

A Semi-Supervised Approach for Gender Identification

no code implementations • LREC 2016 • Juan Soler, Leo Wanner

In most of the research studies on Author Profiling, large quantities of correctly labeled data are used to train the models.

Classification of Lexical Collocation Errors in the Writings of Learners of Spanish

no code implementations • RANLP 2015 • Sara Rodríguez-Fernández, Roberto Carlini, Leo Wanner

General Classification

Towards a multi-layered dependency annotation of Finnish

no code implementations • WS 2015 • Alicia Burga, Simon Mille, Anton Granvik, Leo Wanner

Visualizing Deep-Syntactic Parser Output

no code implementations • NAACL 2015 • Juan Soler-Company, Miguel Ballesteros, Bernd Bohnet, Simon Mille, Leo Wanner

Machine Translation

Data-driven sentence generation with non-isomorphic trees

no code implementations • HLT 2015 • Bernd Bohnet, Leo Wanner, Simon Mille, Miguel Ballesteros

Abstractive Text Summarization Extractive Summarization +4

Improving Collocation Correction by Ranking Suggestions Using Linguistic Knowledge

no code implementations • WS 2014 • Roberto Carlini, Joan Codina-Filba, Leo Wanner

Deep-Syntactic Parsing

no code implementations • COLING 2014 • Miguel Ballesteros, Bernd Bohnet, Simon Mille, Leo Wanner

Machine Translation Text Simplification

Classifiers for data-driven deep sentence generation

no code implementations • WS 2014 • Miguel Ballesteros, Simon Mille, Leo Wanner

Sentence Text Generation

How to Use less Features and Reach Better Performance in Author Gender Identification

no code implementations • LREC 2014 • Juan Soler Company, Leo Wanner

Over the last years, author profiling in general and author gender identification in particular have become a popular research area due to their potential attractive applications that range from forensic investigations to online marketing studies.

Dimensionality Reduction Feature Engineering +1

An Exercise in Reuse of Resources: Adapting General Discourse Coreference Resolution for Detecting Lexical Chains in Patent Documentation

no code implementations • LREC 2014 • Nadjet Bouayad-Agha, Alicia Burga, Gerard Casamayor, Joan Codina, Rogelio Nazar, Leo Wanner

The Stanford Coreference Resolution System (StCR) is a multi-pass, rule-based system that scored best in the CoNLL 2011 shared task on general discourse coreference resolution.

coreference-resolution Domain Adaptation +1