Search Results for author: Horacio Saggion

Found 68 papers, 12 papers with code

Identification of complex words and passages in medical documents in French

no code implementations JEP/TALN/RECITAL 2022 Kim Cheng SHEANG, Anaïs Koptient, Natalia Grabar, Horacio Saggion

We propose work on identifying complex words and passages in French biomedical documents.

ALEXSIS: A Dataset for Lexical Simplification in Spanish

1 code implementation LREC 2022 Daniel Ferrés, Horacio Saggion

Lexical Simplification is the process of reducing the lexical complexity of a text by replacing difficult words with easier to read (or understand) expressions while preserving the original information and meaning.

Lexical Simplification

Controllable Sentence Simplification with a Unified Text-to-Text Transfer Transformer

1 code implementation INLG (ACL) 2021 Kim Cheng SHEANG, Horacio Saggion

Recently, a large pre-trained language model called T5 (A Unified Text-to-Text Transfer Transformer) has achieved state-of-the-art performance in many NLP tasks.

Language Modelling Sentence +1

Syntax-aware Transformers for Neural Machine Translation: The Case of Text to Sign Gloss Translation

1 code implementation RANLP (BUCC) 2021 Santiago Egea Gómez, Euan McGill, Horacio Saggion

It is well-established that the preferred mode of communication of the deaf and hard of hearing (DHH) community are Sign Languages (SLs), but they are considered low resource languages where natural language processing technologies are of concern.

Machine Translation Translation +1

Challenges with Sign Language Datasets for Sign Language Recognition and Translation

no code implementations LREC 2022 Mirella De Sisto, Vincent Vandeghinste, Santiago Egea Gómez, Mathieu De Coster, Dimitar Shterionov, Horacio Saggion

Furthermore, we propose a framework to address the lack of standardization at format level, unify the available resources and facilitate SL research for different languages.

Sign Language Recognition Translation

Translating Spanish into Spanish Sign Language: Combining Rules and Data-driven Approaches

no code implementations LoResMT (COLING) 2022 Luis Chiruzzo, Euan McGill, Santiago Egea-Gómez, Horacio Saggion

This paper presents a series of experiments on translating between spoken Spanish and Spanish Sign Language glosses (LSE), including enriching Neural Machine Translation (NMT) systems with linguistic features, and creating synthetic data to pretrain and later on finetune a neural translation model.

Machine Translation NMT +1

Exploring the limits of a base BART for multi-document summarization in the medical domain

no code implementations SDP (COLING) 2022 Ishmael Obonyo, Silvia Casola, Horacio Saggion

This paper is a description of our participation in the Multi-document Summarization for Literature Review (MSLR) Shared Task, in which we explore summarization models to create an automatic review of scientific results.

Document Summarization Multi-Document Summarization

A Novel Dataset for Financial Education Text Simplification in Spanish

no code implementations 15 Dec 2023 Nelson Perez-Rojas, Saul Calderon-Ramirez, Martin Solis-Salazar, Mario Romero-Sandoval, Monica Arias-Monge, Horacio Saggion

Text simplification, crucial in natural language processing, aims to make texts more comprehensible, particularly for specific groups like visually impaired Spanish speakers, a less-represented language in this field.

Data Augmentation Sentence +1

Creating a silver standard for patent simplification

1 code implementation 24 Oct 2023 Silvia Casola, Alberto Lavelli, Horacio Saggion

Patents are legal documents that aim at protecting inventions on the one hand and at making technical knowledge circulate on the other.

Information Retrieval Retrieval

Multilingual Controllable Transformer-Based Lexical Simplification

1 code implementation 5 Jul 2023 Kim Cheng SHEANG, Horacio Saggion

Moreover, further evaluation of our approach on the part of the recent TSAR-2022 multilingual LS shared-task dataset shows that our model performs competitively when compared with the participating systems for English LS and even outperforms the GPT-3 model on several metrics.

Lexical Simplification Reading Comprehension

Verifying the Robustness of Automatic Credibility Assessment

1 code implementation 14 Mar 2023 Piotr Przybyła, Alexander Shvets, Horacio Saggion

Text classification methods have been widely investigated as a way to detect content of low credibility: fake news, social media bots, propaganda, etc.

Misinformation text-classification +1

Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification

no code implementations 6 Feb 2023 Horacio Saggion, Sanja Štajner, Daniel Ferrés, Kim Cheng SHEANG, Matthew Shardlow, Kai North, Marcos Zampieri

We report findings of the TSAR-2022 shared task on multilingual lexical simplification, organized as part of the Workshop on Text Simplification, Accessibility, and Readability TSAR-2022 held in conjunction with EMNLP 2022.

Lexical Simplification Text Simplification

Controllable Lexical Simplification for English

1 code implementation 6 Feb 2023 Kim Cheng SHEANG, Daniel Ferrés, Horacio Saggion

Fine-tuning Transformer-based approaches have recently shown exciting results on the sentence simplification task.

Lexical Simplification Sentence

Lexical Simplification Benchmarks for English, Portuguese, and Spanish

2 code implementations 12 Sep 2022 Sanja Stajner, Daniel Ferres, Matthew Shardlow, Kai North, Marcos Zampieri, Horacio Saggion

To showcase the usability of the dataset, we adapt two state-of-the-art lexical simplification systems with differing architectures (neural vs. non-neural) to all three languages (English, Spanish, and Brazilian Portuguese) and evaluate their performances on our new dataset.

Lexical Simplification

LaSTUS/TALN at TRAC - 2020 Trolling, Aggression and Cyberbullying

no code implementations LREC 2020 Lütfiye Seda Mut Altın, Alex Bravo, Horacio Saggion

This paper presents the participation of the LaSTUS/TALN team at TRAC-2020 Trolling, Aggression and Cyberbullying shared task.

Transferring Knowledge from Discourse to Arguments: A Case Study with Scientific Abstracts

no code implementations WS 2019 Pablo Accuosto, Horacio Saggion

In this work we propose to leverage resources available with discourse-level annotations to facilitate the identification of argumentative components and relations in scientific texts, which has been recognized as a particularly challenging task.

Argument Mining Discourse Parsing +1

Recognizing Musical Entities in User-generated Content

1 code implementation 1 Apr 2019 Lorenzo Porcaro, Horacio Saggion

Recognizing Musical Entities is important for Music Information Retrieval (MIR) since it can improve the performance of several tasks such as music recommendation, genre classification or artist similarity.

General Classification Genre classification +4

Interpretable Emoji Prediction via Label-Wise Attention LSTMs

no code implementations EMNLP 2018 Francesco Barbieri, Luis Espinosa-Anke, Jose Camacho-Collados, Steven Schockaert, Horacio Saggion

Human language has evolved towards newer forms of communication such as social media, where emojis (i.e., ideograms bearing a visual meaning) play a key role.

Emotion Recognition Information Retrieval +3

LaSTUS/TALN at Complex Word Identification (CWI) 2018 Shared Task

no code implementations WS 2018 Ahmed Abura'ed, Horacio Saggion

The purpose of the task was to determine if a word in a given sentence can be judged as complex or not by a certain target audience.

Complex Word Identification Lexical Simplification +2

Exploring Emoji Usage and Prediction Through a Temporal Variation Lens

no code implementations 2 May 2018 Francesco Barbieri, Luis Marujo, Pradeep Karuturi, William Brendel, Horacio Saggion

The frequent use of Emojis on social media platforms has created a new form of multimodal social interaction.

Multimodal Emoji Prediction

1 code implementation NAACL 2018 Francesco Barbieri, Miguel Ballesteros, Francesco Ronzano, Horacio Saggion

Emojis are small images that are commonly included in social media text messages.

What Sentence are you Referring to and Why? Identifying Cited Sentences in Scientific Literature

no code implementations RANLP 2017 Ahmed Abura'ed, Luis Chiruzzo, Horacio Saggion

Current citation networks, which link papers by citation relationships (reference and citing paper), are useful to quantitatively understand the value of a piece of scientific work, however they are limited in that they do not provide information about what specific part of the reference paper the citing paper is referring to.

Sentence

An Adaptable Lexical Simplification Architecture for Major Ibero-Romance Languages

no code implementations WS 2017 Daniel Ferrés, Horacio Saggion, Xavier Gómez Guinovart

Lexical Simplification is the task of reducing the lexical complexity of textual documents by replacing difficult words with easier to read (or understand) expressions while preserving the original meaning.

Lexical Simplification Text Simplification

Are Emojis Predictable?

3 code implementations EACL 2017 Francesco Barbieri, Miguel Ballesteros, Horacio Saggion

Emojis are ideograms which are naturally combined with plain text to visually complement or condense the meaning of a message.

Natural Language Processing for Intelligent Access to Scientific Information

no code implementations COLING 2016 Horacio Saggion, Francesco Ronzano

Summarization techniques can reduce the size of long papers to their essential content or automatically generate state-of-the-art-reviews.

Natural Language Inference Question Answering

A Multi-Layered Annotated Corpus of Scientific Papers

no code implementations LREC 2016 Beatriz Fisas, Francesco Ronzano, Horacio Saggion

In addition, a grade is allocated to each sentence according to its relevance for being included in a summary. To the best of our knowledge, this complex, multi-layered collection of annotations and metadata characterizing a set of research papers had never been grouped together before in one corpus and therefore constitutes a newer, richer resource with respect to those currently available in the field.

Sentence

Modelling Irony in Twitter: Feature Analysis and Evaluation

no code implementations LREC 2014 Francesco Barbieri, Horacio Saggion

We propose in this paper a new set of experiments to assess the relevance of the features included in our model.

Can Numerical Expressions Be Simpler? Implementation and Demostration of a Numerical Simplification System for Spanish

no code implementations LREC 2014 Susana Bautista, Horacio Saggion

Information in newspapers is often shown in the form of numerical expressions which present comprehension problems for many people, including people with disabilities, illiteracy or lack of access to advanced technology.

Text Simplification

Creating Summarization Systems with SUMMA

no code implementations LREC 2014 Horacio Saggion

Automatic text summarization, the reduction of a text to its essential content, is fundamental for an on-line information society.

Sentence Text Summarization

The CONCISUS Corpus of Event Summaries

no code implementations LREC 2012 Horacio Saggion, Sandra Szasz

In this paper we present a comparable corpus in Spanish and English for the study of cross-lingual information extraction and summarization: the CONCISUS Corpus.

Text Generation Text Summarization
