Search Results for author: Leo Wanner

Found 58 papers, 10 papers with code

CollFrEn: Rich Bilingual English–French Collocation Resource

1 code implementation COLING (MWE) 2020 Beatriz Fisas, Luis Espinosa Anke, Joan Codina-Filbá, Leo Wanner

Collocations in the sense of idiosyncratic lexical co-occurrences of two syntactically bound words traditionally pose a challenge to language learners and many Natural Language Processing (NLP) applications alike.

Machine Translation Relation Classification +3

The Third Multilingual Surface Realisation Shared Task (SR’20): Overview and Evaluation Results

1 code implementation MSR (COLING) 2020 Simon Mille, Anya Belz, Bernd Bohnet, Thiago Castro Ferreira, Yvette Graham, Leo Wanner

As in SR’18 and SR’19, the shared task comprised two tracks: (1) a Shallow Track where the inputs were full UD structures with word order information removed and tokens lemmatised; and (2) a Deep Track where additionally, functional words and morphological information were removed.

Targets and Aspects in Social Media Hate Speech

1 code implementation ACL (WOAH) 2021 Alexander Shvets, Paula Fortuna, Juan Soler, Leo Wanner

Mainstream research on hate speech focused so far predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse-grained hate speech categories.

Abusive Language

Cartography of Natural Language Processing for Social Good (NLP4SG): Searching for Definitions, Statistics and White Spots

no code implementations ACL (NLP4PosImpact) 2021 Paula Fortuna, Laura Pérez-Mayos, Ahmed Abura’Ed, Juan Soler-Company, Leo Wanner

Based on a list of keywords retrieved from the literature and revised in view of the task, we select from this corpus articles that can be considered to be on NLP4SG according to our definition and analyze them in terms of trends along the time line, etc.

Ethics Text Simplification

Disentangling Hate Across Target Identities

1 code implementation 14 Oct 2024 Yiping Jin, Leo Wanner, Aneesh Moideen Koya

Experiments on popular industrial and academic models demonstrate that HS detectors assign a higher hatefulness score merely based on the mention of specific target identities.

GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?

1 code implementation 23 Feb 2024 Yiping Jin, Leo Wanner, Alexander Shvets

A recent proposal in this direction is HateCheck, a suite for testing fine-grained model functionalities on synthesized data generated using templates of the kind "You are just a [slur] to me."

Hate Speech Detection Natural Language Inference +1

User Identity Linkage in Social Media Using Linguistic and Social Interaction Features

no code implementations 22 Aug 2023 Despoina Chatzakou, Juan Soler-Company, Theodora Tsikrika, Leo Wanner, Stefanos Vrochidis, Ioannis Kompatsiaris

Social media users often hold several accounts in their effort to multiply the spread of their thoughts, ideas, and viewpoints.

Towards Weakly-Supervised Hate Speech Classification Across Datasets

no code implementations 4 May 2023 Yiping Jin, Leo Wanner, Vishakha Laxman Kadam, Alexander Shvets

As pointed out by several scholars, current research on hate speech (HS) recognition is characterized by unsystematic data creation strategies and diverging annotation schemata.

Text Classification

Assessing the Syntactic Capabilities of Transformer-based Multilingual Language Models

no code implementations Findings (ACL) 2021 Laura Pérez-Mayos, Alba Táboas García, Simon Mille, Leo Wanner

More specifically, we evaluate the syntactic generalization potential of the models on English and Spanish tests, comparing the syntactic abilities of monolingual and multilingual models on the same language (English), and of multilingual models on two different languages (English and Spanish).

Cross-Lingual Transfer

Evaluating language models for the retrieval and categorization of lexical collocations

1 code implementation EACL 2021 Luis Espinosa Anke, Joan Codina-Filba, Leo Wanner

We first construct a dataset of apparitions of lexical collocations in context, categorized into 17 representative semantic categories.

Retrieval valid

On the Evolution of Syntactic Information Encoded by BERT's Contextualized Representations

no code implementations EACL 2021 Laura Pérez-Mayos, Roberto Carlini, Miguel Ballesteros, Leo Wanner

The adaptation of pretrained language models to solve supervised tasks has become a baseline in NLP, and many recent works have focused on studying how linguistic information is encoded in the pretrained sentence representations.

Constituency Parsing POS +2

Concept Extraction Using Pointer-Generator Networks

1 code implementation 25 Aug 2020 Alexander Shvets, Leo Wanner

Concept extraction is crucial for a number of downstream applications.

Concept Alignment

Toxic, Hateful, Offensive or Abusive? What Are We Really Classifying? An Empirical Analysis of Hate Speech Datasets

no code implementations LREC 2020 Paula Fortuna, Juan Soler, Leo Wanner

The field of the automatic detection of hate speech and related concepts has raised a lot of interest in the last years.

The Second Multilingual Surface Realisation Shared Task (SR'19): Overview and Evaluation Results

no code implementations WS 2019 Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Leo Wanner

We report results from the SR'19 Shared Task, the second edition of a multilingual surface realisation task organised as part of the EMNLP'19 Workshop on Multilingual Surface Realisation.

Collocation Classification with Unsupervised Relation Vectors

1 code implementation ACL 2019 Luis Espinosa Anke, Steven Schockaert, Leo Wanner

Lexical relation classification is the task of predicting whether a certain relation holds between a given pair of words.

Classification General Classification +3

Sentence Packaging in Text Generation from Semantic Graphs as a Community Detection Problem

no code implementations WS 2018 Alexander Shvets, Simon Mille, Leo Wanner

An increasing amount of research tackles the challenge of text generation from abstract ontological or semantic structures, which are in their very nature potentially large connected graphs.

Community Detection Sentence +2

Underspecified Universal Dependency Structures as Inputs for Multilingual Surface Realisation

no code implementations WS 2018 Simon Mille, Anja Belz, Bernd Bohnet, Leo Wanner

In this paper, we present the datasets used in the Shallow and Deep Tracks of the First Multilingual Surface Realisation Shared Task (SR'18).

Natural Language Understanding Text Generation

The First Multilingual Surface Realisation Shared Task (SR'18): Overview and Evaluation Results

no code implementations WS 2018 Simon Mille, Anja Belz, Bernd Bohnet, Yvette Graham, Emily Pitler, Leo Wanner

We report results from the SR'18 Shared Task, a new multilingual surface realisation task organised as part of the ACL'18 Workshop on Multilingual Surface Realisation.

Shared Task Proposal: Multilingual Surface Realization Using Universal Dependency Trees

no code implementations WS 2017 Simon Mille, Bernd Bohnet, Leo Wanner, Anja Belz

We propose a shared task on multilingual Surface Realization, i.e., on mapping unordered and uninflected universal dependency trees to correctly ordered and inflected sentences in a number of languages.

Machine Translation POS +1

A demo of FORGe: the Pompeu Fabra Open Rule-based Generator

no code implementations WS 2017 Simon Mille, Leo Wanner

This demo paper presents the multilingual deep sentence generator developed by the TALN group at Universitat Pompeu Fabra, implemented as a series of rule-based graph-transducers for the syntacticization of the input graphs, the resolution of morphological agreements, and the linearization of the trees.

Sentence Text Generation

FORGe at SemEval-2017 Task 9: Deep sentence generation based on a sequence of graph transducers

no code implementations SEMEVAL 2017 Simon Mille, Roberto Carlini, Alicia Burga, Leo Wanner

We present the contribution of Universitat Pompeu Fabra's NLP group to the SemEval Task 9.2 (AMR-to-English Generation).

Sentence

Automatic Extraction of Parallel Speech Corpora from Dubbed Movies

no code implementations WS 2017 Alp Öktem, Mireia Farrús, Leo Wanner

This paper presents a methodology to extract parallel speech corpora based on any language pair from dubbed movies, together with an application framework in which some corresponding prosodic parameters are extracted.

Speech-to-Speech Translation Translation

On the Relevance of Syntactic and Discourse Features for Author Profiling and Identification

no code implementations EACL 2017 Juan Soler-Company, Leo Wanner

The majority of approaches to author profiling and author identification focus mainly on lexical features, i.e., on the content of a text.

Author Profiling Feature Engineering

Praat on the Web: An Upgrade of Praat for Semi-Automatic Speech Annotation

no code implementations COLING 2016 Mónica Domínguez, Iván Latorre, Mireia Farrús, Joan Codina-Filbà, Leo Wanner

This paper presents an implementation of the widely used speech analysis tool Praat as a web application with an extended functionality for feature annotation.

Towards Multiple Antecedent Coreference Resolution in Specialized Discourse

no code implementations LREC 2016 Alicia Burga, Sergio Cajal, Joan Codina-Filbà, Leo Wanner

Despite the popularity of coreference resolution as a research topic, the overwhelming majority of the work in this area focused so far on single antecedence coreference only.

Abstractive Text Summarization Coreference Resolution

Example-based Acquisition of Fine-grained Collocation Resources

no code implementations LREC 2016 Sara Rodríguez-Fernández, Roberto Carlini, Luis Espinosa Anke, Leo Wanner

Collocations such as "heavy rain" or "make [a] decision" are combinations of two elements where one (the base) is freely chosen, while the choice of the other (collocate) is restricted, depending on the base.

Word Embeddings

A Semi-Supervised Approach for Gender Identification

no code implementations LREC 2016 Juan Soler, Leo Wanner

In most of the research studies on Author Profiling, large quantities of correctly labeled data are used to train the models.

Author Profiling

How to Use less Features and Reach Better Performance in Author Gender Identification

no code implementations LREC 2014 Juan Soler Company, Leo Wanner

Over the last years, author profiling in general and author gender identification in particular have become a popular research area due to their potential attractive applications that range from forensic investigations to online marketing studies.

Author Profiling Dimensionality Reduction +2
