no code implementations • SemEval (NAACL) 2022 • Jeremy Barnes, Laura Oberlaender, Enrica Troiano, Andrey Kutuzov, Jan Buchmann, Rodrigo Agerri, Lilja Øvrelid, Erik Velldal
In this paper, we introduce the first SemEval shared task on Structured Sentiment Analysis, for which participants are required to predict all sentiment graphs in a text, where a single sentiment graph is composed of a sentiment holder, target, expression and polarity.
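As a rough illustration of the structure described above, a sentiment graph can be sketched as a small record with four fields. The class name, field names, example sentence and the choice to make holder and target optional are all assumptions made for illustration, not the shared task's official data format.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SentimentGraph:
    """One opinion: who holds it, about what, expressed how, with which polarity."""
    holder: Optional[str]   # assumed optional: the holder may be implicit
    target: Optional[str]   # assumed optional: the target may be implicit
    expression: str         # the words that carry the sentiment
    polarity: str           # e.g. "positive" or "negative"


# Hypothetical sentence containing two opinions, hence two graphs.
text = "The critics loved the film but hated the soundtrack."
graphs: List[SentimentGraph] = [
    SentimentGraph("The critics", "the film", "loved", "positive"),
    SentimentGraph("The critics", "the soundtrack", "hated", "negative"),
]

polarities = [g.polarity for g in graphs]
```

Predicting "all sentiment graphs in a text" then amounts to returning the full list for each input, rather than a single label per sentence.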
1 code implementation • LREC 2022 • Gorka Urbizu, Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri, Aitor Soroa
Natural Language Understanding (NLU) technology has improved significantly over the last few years, and multitask benchmarks such as GLUE are key to evaluating this improvement in a robust and general way.
no code implementations • FEVER (ACL) 2022 • Blanca Calvo Figueras, Montse Oller, Rodrigo Agerri
The influence of fake news on the perception of reality has become a mainstream topic in recent years due to the fast propagation of misleading information.
1 code implementation • Findings (EMNLP) 2021 • Iker García-Ferrero, Rodrigo Agerri, German Rigau
In the last few years, several methods have been proposed to build meta-embeddings.
1 code implementation • 18 Oct 2024 • Blanca Calvo Figueras, Rodrigo Agerri
The development of Large Language Models (LLMs) has brought impressive performance in mitigation strategies against misinformation, such as counterargument generation.
1 code implementation • 7 Oct 2024 • Ekaterina Sviridova, Anar Yeginbergen, Ainara Estarrona, Elena Cabrio, Serena Villata, Rodrigo Agerri
In this paper, we follow this direction, and we present, to the best of our knowledge, the first multilingual dataset for Medical Question Answering where correct and incorrect diagnoses for a clinical case are enriched with a natural language explanation written by doctors.
1 code implementation • 4 Jul 2024 • Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri
Contrary to previous work, we show that for Argument Mining, data-transfer obtains better results than model-transfer and that fine-tuning outperforms few-shot methods.
1 code implementation • 21 Jun 2024 • Irune Zubiaga, Aitor Soroa, Rodrigo Agerri
This paper proposes a novel approach to evaluate Counter Narrative (CN) generation using a Large Language Model (LLM) as an evaluator.
no code implementations • 12 Jun 2024 • Joseba Fernandez de Landa, Rodrigo Agerri
Social media users express their political preferences via interaction with other users, by spontaneous declarations or by participation in communities within the network.
no code implementations • 11 Apr 2024 • Iker García-Ferrero, Rodrigo Agerri, Aitziber Atutxa Salazar, Elena Cabrio, Iker de la Iglesia, Alberto Lavelli, Bernardo Magnini, Benjamin Molinet, Johana Ramirez-Romero, German Rigau, Jose Maria Villa-Gonzalez, Serena Villata, Andrea Zaninello
While these LLMs display competitive performance on automated medical texts benchmarks, they have been pre-trained and evaluated with a focus on a single language (English mostly).
1 code implementation • 10 Apr 2024 • Elisa Sanchez-Bayona, Rodrigo Agerri
Metaphors, although occasionally unperceived, are ubiquitous in our everyday language.
no code implementations • 8 Apr 2024 • Iñigo Alonso, Maite Oronoz, Rodrigo Agerri
So far the benchmark is available in four languages, but we hope that this work may encourage further development to other languages.
1 code implementation • 25 Mar 2024 • Olia Toporkov, Rodrigo Agerri
We experiment with seven languages of different morphological complexity, namely, English, Spanish, Basque, Russian, Czech, Turkish and Polish, using multilingual and language-specific pre-trained masked language encoder-only models as a backbone to build our lemmatizers.
no code implementations • 14 Mar 2024 • Jaione Bengoetxea, Yi-Ling Chung, Marco Guerini, Rodrigo Agerri
Being a parallel corpus, also with respect to the original English CONAN, it enables novel research on multilingual and crosslingual automatic generation of CNs.
1 code implementation • 1 Dec 2023 • Iakes Goenaga, Aitziber Atutxa, Koldo Gojenola, Maite Oronoz, Rodrigo Agerri
Comprehensive experimentation with language models for Spanish shows that sometimes multilingual models fare better than monolingual ones, even outperforming models which have been adapted to the medical domain.
no code implementations • 20 Nov 2023 • Maxime Masson, Rodrigo Agerri, Christian Sallaberry, Marie-Noelle Bessagnet, Annig Le Parc Lacayrelle, Philippe Roose
Extensive experimentation on a newly collected and annotated multilingual (French, English, and Spanish) dataset composed of tourism-related tweets shows that current few-shot learning techniques allow us to obtain competitive results for all three tasks with very little annotation data: 5 tweets per label (15 in total) for Sentiment Analysis, 10% of the tweets for location detection (around 160) and 13% (200 approx.)
1 code implementation • 5 Oct 2023 • Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri, Oier Lopez de Lacalle, German Rigau, Eneko Agirre
In this paper, we propose GoLLIE (Guideline-following Large Language Model for IE), a model able to improve zero-shot results on unseen IE tasks by virtue of being fine-tuned to comply with annotation guidelines.
Ranked #1 on Zero-shot Named Entity Recognition (NER) on HarveyNER (using extra training data)
1 code implementation • 9 Jun 2023 • Rodrigo Agerri, Iñigo Alonso, Aitziber Atutxa, Ander Berrondo, Ainara Estarrona, Iker Garcia-Ferrero, Iakes Goenaga, Koldo Gojenola, Maite Oronoz, Igor Perez-Tejedor, German Rigau, Anar Yeginbergenova
Providing high quality explanations for AI predictions based on machine learning is a challenging and complex task.
1 code implementation • 27 Apr 2023 • Nayla Escribano, German Rigau, Rodrigo Agerri
Detecting and normalizing temporal expressions is an essential step for many NLP tasks.
no code implementations • 1 Feb 2023 • Olia Toporkov, Rodrigo Agerri
Given that the process to obtain a lemma from an inflected word can be explained by looking at its morphosyntactic category, including fine-grained morphosyntactic information to train contextual lemmatizers has become common practice, without considering whether that is the optimum in terms of downstream performance.
1 code implementation • 25 Jan 2023 • Anar Yeginbergen, Rodrigo Agerri
Nowadays, the medical domain is receiving more and more attention in applications involving Artificial Intelligence, as clinicians' decision-making is increasingly dependent on dealing with enormous amounts of unstructured textual data.
2 code implementations • 20 Dec 2022 • Iker García-Ferrero, Rodrigo Agerri, German Rigau
In the absence of readily available labeled data for a given sequence labeling task and language, annotation projection has been proposed as one of the possible strategies to automatically generate annotated data.
Ranked #1 on Cross-Lingual NER on MasakhaNER2.0 (Hausa metric)
1 code implementation • 16 Dec 2022 • Rodrigo Agerri, Eneko Agirre
Given the impact of language models on the field of Natural Language Processing, a number of Spanish encoder-only masked language models (aka BERTs) have been trained and released.
4 code implementations • 23 Oct 2022 • Iker García-Ferrero, Rodrigo Agerri, German Rigau
Zero-resource cross-lingual transfer approaches aim to apply supervised models from a source language to unlabelled target languages.
Ranked #1 on Cross-Lingual NER on CoNLL Spanish
no code implementations • 19 Oct 2022 • Elisa Sanchez-Bayona, Rodrigo Agerri
The lack of wide coverage datasets annotated with everyday metaphorical expressions for languages other than English is striking.
no code implementations • 11 Oct 2022 • Joseba Fernandez de Landa, Rodrigo Agerri
The large majority of the research performed on stance detection has been focused on developing more or less sophisticated text classification systems, even when many benchmarks are based on social network data such as Twitter.
1 code implementation • LREC 2022 • Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri
Parliamentary transcripts provide a valuable resource to understand the reality and know about the most important facts that occur over time in our societies.
no code implementations • 15 Mar 2022 • Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa
For instance, 66% of documents are rated as high-quality for EusCrawl, in contrast with <33% for both mC4 and CC100.
1 code implementation • EMNLP (ArgMining) 2021 • Yi-Ling Chung, Marco Guerini, Rodrigo Agerri
The growing interest in employing counter narratives for hatred intervention brings with it a focus on dataset creation and automation strategies.
no code implementations • 28 Jan 2021 • Elena Zotova, Rodrigo Agerri, German Rigau
While interactions in social media such as Twitter occur in many natural languages, research on stance detection (the position or attitude expressed with respect to a specific topic) within the Natural Language Processing field has largely been done for English.
no code implementations • LREC 2020 • Elena Zotova, Rodrigo Agerri, Manuel Nuñez, German Rigau
The TW-10 Referendum Dataset released at IberEval 2018 is a previous effort to provide multilingual stance-annotated data in Catalan and Spanish.
1 code implementation • LREC 2020 • Rodrigo Agerri, Iñaki San Vicente, Jon Ander Campos, Ander Barrena, Xabier Saralegi, Aitor Soroa, Eneko Agirre
This is suboptimal as, for many languages, the models have been trained on smaller (or lower quality) corpora.
1 code implementation • 31 Mar 2020 • Elena Zotova, Rodrigo Agerri, Manuel Nuñez, German Rigau
The TW-10 Referendum Dataset released at IberEval 2018 is a previous effort to provide multilingual stance-annotated data in Catalan and Spanish.
2 code implementations • 17 Jan 2020 • Iker García-Ferrero, Rodrigo Agerri, German Rigau
This paper presents a new technique for creating monolingual and cross-lingual meta-embeddings.
no code implementations • SEMEVAL 2019 • Rodrigo Agerri
In this paper we describe our participation in the Hyperpartisan News Detection shared task at SemEval 2019.
no code implementations • 28 Jan 2019 • Rodrigo Agerri, German Rigau
In this research note we present a language independent system to model Opinion Target Extraction (OTE) as a sequence labelling task.
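To make the sequence labelling framing concrete, the sketch below turns opinion target spans into one tag per token. The BIO scheme, the `TARGET` label, the function name and the example sentence are assumptions chosen for illustration; the excerpt above does not state which encoding the system uses.

```python
from typing import List, Tuple


def spans_to_bio(tokens: List[str], spans: List[Tuple[int, int]],
                 label: str = "TARGET") -> List[str]:
    """Convert token-index spans (start inclusive, end exclusive) to BIO tags."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = f"B-{label}"          # first token of the target
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the target
    return tags


# Hypothetical review sentence with one opinion target: "battery life".
tokens = ["The", "battery", "life", "is", "great"]
tags = spans_to_bio(tokens, [(1, 3)])
```

With this encoding, extracting opinion targets reduces to predicting one tag per token, which is what lets a generic sequence labeller be reused across languages.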
no code implementations • 28 Sep 2018 • Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri
Crawled data is processed by means of the EliXa Sentiment Analysis system.
no code implementations • SEMEVAL 2015 • Iñaki San Vicente, Xabier Saralegi, Rodrigo Agerri
This paper presents a supervised Aspect Based Sentiment Analysis (ABSA) system.
no code implementations • 6 Feb 2017 • Iñaki San Vicente, Rodrigo Agerri, German Rigau
This paper presents a simple, robust and (almost) unsupervised dictionary-based method, qwn-ppv (Q-WordNet as Personalized PageRanking Vector) to automatically generate polarity lexicons.
no code implementations • 2 Feb 2017 • Egoitz Laparra, Rodrigo Agerri, Itziar Aldabe, German Rigau
In this paper we present an approach to extract ordered timelines of events, their participants, locations and times from a set of multilingual and cross-lingual data sources.
1 code implementation • 31 Jan 2017 • Rodrigo Agerri, German Rigau
Finally, the results show that our emphasis on clustering features is crucial to develop robust out-of-domain models.
Ranked #61 on Named Entity Recognition (NER) on CoNLL 2003 (English)
no code implementations • LREC 2014 • Rodrigo Agerri, Josu Bermudez, German Rigau
IXA pipeline is a modular set of Natural Language Processing tools (or pipes) which provide easy access to NLP technology.
no code implementations • LREC 2014 • Isa Maks, Ruben Izquierdo, Francesca Frontini, Rodrigo Agerri, Piek Vossen, Andoni Azpeitia
In this paper we focus on the creation of general-purpose (as opposed to domain-specific) polarity lexicons in five languages: French, Italian, Dutch, English and Spanish using WordNet propagation.
no code implementations • LREC 2012 • Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maučec, Andy Way, Panayota Georgakopoulou, Martin Volk
Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase the efficiency of the subtitle production process.