Search Results for author: ro

Found 182 papers, 5 papers with code

Neural Multi-task Text Normalization and Sanitization with Pointer-Generator

no code implementations WS 2020 Hoang Nguyen, S Cavallari, ro

Its generator effectively captures linguistic context during normalization and sanitization while its pointer dynamically preserves the entities that are generally missing in the vocabulary.

Dialogue Generation Information Retrieval +1

A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

no code implementations CL 2020 Raúl Vázquez, Aless Raganato, ro, Mathias Creutz, Jörg Tiedemann

In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks.

Machine Translation Sentence +2

Supervised Event Coding from Text Written in Arabic: Introducing Hadath

no code implementations LREC 2020 Javier Osorio, Alej Reyes, Alej Beltrán, ro, Atal Ahmadzai

This article introduces Hadath, a supervised protocol for coding event data from text written in Arabic.

Are Word Embeddings Really a Bad Fit for the Estimation of Thematic Fit?

no code implementations LREC 2020 Emmanuele Chersoni, Ludovica Pannitto, Enrico Santus, Aless Lenci, ro, Chu-Ren Huang

While neural embeddings represent a popular choice for word representation in a wide variety of NLP tasks, their usage for thematic fit modeling has been limited, as they have been reported to lag behind syntax-based count models.

Word Embeddings

An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems

1 code implementation LREC 2020 Aless Raganato, ro, Yves Scherrer, Jörg Tiedemann

Lexical ambiguity is one of the many challenging linguistic phenomena involved in translation, i.e., translating an ambiguous word with its correct sense.

Machine Translation Translation +1

Is the Red Square Big? MALeViC: Modeling Adjectives Leveraging Visual Contexts

no code implementations IJCNLP 2019 S Pezzelle, ro, Raquel Fernández

This work aims at modeling how the meaning of gradable adjectives of size ('big', 'small') can be learned from visually-grounded contexts.

Big Generalizations with Small Data: Exploring the Role of Training Samples in Learning Adjectives of Size

no code implementations WS 2019 S Pezzelle, ro, Raquel Fernández

In this paper, we experiment with a recently proposed visual reasoning task dealing with quantities – modeling the multimodal, contextually-dependent meaning of size adjectives ('big', 'small') – and explore the impact of varying the training data on the learning behavior of a state-of-art system.

Small Data Image Classification Visual Reasoning

Demo Application for LETO: Learning Engine Through Ontologies

no code implementations RANLP 2019 Suilan Estevez-Velarde, Andrés Montoyo, Yudivian Almeida-Cruz, Yoan Gutiérrez, Alej Piad-Morffis, ro, Rafael Muñoz

The massive amount of multi-formatted information available on the Web necessitates the design of software systems that leverage this information to obtain knowledge that is valid and useful.

valid

A Neural Network Component for Knowledge-Based Semantic Representations of Text

no code implementations RANLP 2019 Alej Piad-Morffis, ro, Rafael Muñoz, Yoan Gutiérrez, Yudivian Almeida-Cruz, Suilan Estevez-Velarde, Andrés Montoyo

SNNs can be trained to encode explicit semantic knowledge from an arbitrary knowledge base, and can subsequently be combined with other deep learning architectures.

Opinion Mining

Give It a Shot: Few-shot Learning to Normalize ADR Mentions in Social Media Posts

no code implementations WS 2019 Emmanouil Manousogiannis, Sepideh Mesbah, Aless Bozzon, ro, Selene Baez, Robert Jan Sips

This paper describes the system that team MYTOMORROWS-TU DELFT developed for the 2019 Social Media Mining for Health Applications (SMM4H) Shared Task 3, for the end-to-end normalization of ADR tweet mentions to their corresponding MEDDRA codes.

Entity Linking Few-Shot Learning +1

Distributional Semantics Meets Construction Grammar. Towards a Unified Usage-Based Model of Grammar and Meaning

no code implementations WS 2019 Giulia Rambelli, Emmanuele Chersoni, Philippe Blache, Chu-Ren Huang, Aless Lenci, ro

In this paper, we propose a new type of semantic representation of Construction Grammar that combines constructions with the vector representations used in Distributional Semantics.

An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation

no code implementations WS 2019 Aless Raganato, ro, Raúl Vázquez, Mathias Creutz, Jörg Tiedemann

In this paper, we explore a multilingual translation model with a cross-lingually shared layer that can be used as fixed-size sentence representation in different downstream tasks.

Machine Translation Sentence +1

Dependency Parsing with your Eyes: Dependency Structure Predicts Eye Regressions During Reading

no code implementations WS 2019 Aless Lopopolo, ro, Stefan L. Frank, Antal Van den Bosch, Roel Willems

Backward saccades during reading have been hypothesized to be involved in structural reanalysis, or to be related to the level of text difficulty.

Dependency Parsing

Analyzing the use of existing systems for the CLPsych 2019 Shared Task

no code implementations WS 2019 Alej González Hevia, ro, Rebeca Cerezo Menéndez, Daniel Gayo-Avello

We explore the use of two systems trained with ReachOut data from the 2016 CLPsych task, and compare them to a baseline system trained with the data provided for this task.

General Classification

Quantifiers in a Multimodal World: Hallucinating Vision with Language and Sound

no code implementations WS 2019 Alberto Testoni, S Pezzelle, ro, Raffaella Bernardi

Inspired by the literature on multisensory integration, we develop a computational model to ground quantifiers in perception.

Adapting SimpleNLG to Galician language

1 code implementation WS 2018 Andrea Cascallar-Fuentes, Alej Ramos-Soto, ro, Alberto Bugarín Diz

In this paper, we describe SimpleNLG-GL, an adaptation of the linguistic realisation SimpleNLG library for the Galician language.

Text Generation

An Analysis of Encoder Representations in Transformer-Based Machine Translation

no code implementations WS 2018 Aless Raganato, ro, Jörg Tiedemann

We assess the representations of the encoder by extracting dependency relations based on self-attention weights, we perform four probing tasks to study the amount of syntactic and semantic captured information and we also test attention in a transfer learning scenario.

Feature Engineering Machine Translation +2

Designing and testing the messages produced by a virtual dietitian

no code implementations WS 2018 Luca Anselma, Aless Mazzei, ro

This paper presents a project about the automatic generation of persuasive messages in the context of the diet management.

Data-to-Text Generation Management

Supervised Clustering of Questions into Intents for Dialog System Applications

no code implementations EMNLP 2018 Iryna Haponchyk, Antonio Uva, Seunghak Yu, Olga Uryupina, Aless Moschitti, ro

Modern automated dialog systems require complex dialog managers able to deal with user intent triggered by high-level semantic questions.

Chatbot Clustering +3

The University of Helsinki submissions to the WMT18 news task

no code implementations WS 2018 Aless Raganato, ro, Yves Scherrer, Tommi Nieminen, Arvi Hurskainen, Jörg Tiedemann

This paper describes the University of Helsinki's submissions to the WMT18 shared news translation task for English-Finnish and English-Estonian, in both directions.

Machine Translation Translation

Modeling Violations of Selectional Restrictions with Distributional Semantics

no code implementations WS 2018 Emmanuele Chersoni, Adrià Torrens Urrutia, Philippe Blache, Aless Lenci, ro

Distributional Semantic Models have been successfully used for modeling selectional preferences in a variety of scenarios, since distributional similarity naturally provides an estimate of the degree to which an argument satisfies the requirement of a given predicate.

Learning to Progressively Recognize New Named Entities with Sequence to Sequence Models

no code implementations COLING 2018 Lingzhen Chen, Aless Moschitti, ro

In this paper, we propose to use a sequence to sequence model for Named Entity Recognition (NER) and we explore the effectiveness of such model in a progressive NER setting – a Transfer Learning (TL) setting.

Feature Engineering named-entity-recognition +3

A Flexible, Efficient and Accurate Framework for Community Question Answering Pipelines

no code implementations ACL 2018 Salvatore Romeo, Giovanni Da San Martino, Alberto Barrón-Cedeño, Aless Moschitti, ro

Although deep neural networks have been proving to be excellent tools to deliver state-of-the-art results, when data is scarce and the tackled tasks involve complex semantic inference, deep linguistic processing and traditional structure-based approaches, such as tree kernel methods, are an alternative solution.

Community Question Answering

The DipInfo-UniTo system for SRST 2018

no code implementations WS 2018 Valerio Basile, Aless Mazzei, ro

This paper describes the system developed by the DipInfo-UniTo team to participate to the shallow track of the Surface Realization Shared Task 2018.

Morphological Inflection Text Generation

Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students

no code implementations WS 2018 Alej Dorantes, ro, Gerardo Sierra, Tlauhlia Yamín Donohue Pérez, Gemma Bel-Enguix, Mónica Jasso Rosales

This work presents the Sociolinguistic Corpus of WhatsApp Chats in Spanish among College Students, a corpus of raw data for general use.

Adapting SimpleNLG to Spanish

no code implementations WS 2017 Alej Ramos-Soto, ro, Julio Janeiro-Gallardo, Alberto Bugarín Diz

We describe SimpleNLG-ES, an adaptation of the SimpleNLG realization library for the Spanish language.

Text Generation

What to Write? A topic recommender for journalists

no code implementations WS 2017 Aless Cucchiarelli, ro, Christian Morbidoni, Giovanni Stilo, Paola Velardi

In this paper we present a recommender system, What To Write and Why, capable of suggesting to a journalist, for a given event, the aspects still uncovered in news articles on which the readers focus their interest.

Recommendation Systems

Collaborative Partitioning for Coreference Resolution

no code implementations CONLL 2017 Olga Uryupina, Aless Moschitti, ro

This paper presents a collaborative partitioning algorithm, a novel ensemble-based approach to coreference resolution.

coreference-resolution

Sew-Embed at SemEval-2017 Task 2: Language-Independent Concept Representations from a Semantically Enriched Wikipedia

no code implementations SEMEVAL 2017 Claudio Delli Bovi, Aless Raganato, ro

This paper describes Sew-Embed, our language-independent approach to multilingual and cross-lingual semantic word similarity as part of the SemEval-2017 Task 2.

Semantic Textual Similarity Task 2 +1

Logical Metonymy in a Distributional Model of Sentence Comprehension

no code implementations SEMEVAL 2017 Emmanuele Chersoni, Aless Lenci, ro, Philippe Blache

In theoretical linguistics, logical metonymy is defined as the combination of an event-subcategorizing verb with an entity-denoting direct object (e.g., The author began the book), so that the interpretation of the VP requires the retrieval of a covert event (e.g., writing).

Retrieval Sentence

EuroSense: Automatic Harvesting of Multilingual Sense Annotations from Parallel Text

no code implementations ACL 2017 Claudio Delli Bovi, Jose Camacho-Collados, Aless Raganato, ro, Roberto Navigli

Parallel corpora are widely used in a variety of Natural Language Processing tasks, from Machine Translation to cross-lingual Word Sense Disambiguation, where parallel sentences can be exploited to automatically generate high-quality sense annotations on a large scale.

Entity Linking Machine Translation +2

Measuring the Italian-English lexical gap for action verbs and its impact on translation

no code implementations WS 2017 Lorenzo Gregori, Aless Panunzi, ro

This paper describes a method to measure the lexical gap of action verbs in Italian and English by using the IMAGACT ontology of action.

Translation

A Practical Perspective on Latent Structured Prediction for Coreference Resolution

no code implementations EACL 2017 Iryna Haponchyk, Aless Moschitti, ro

Latent structured prediction theory proposes powerful methods such as Latent Structural SVM (LSSVM), which can potentially be very appealing for coreference resolution (CR).

coreference-resolution feature selection +2

Lexfom: a lexical functions ontology model

no code implementations WS 2016 Alexs Fonseca, ro, Fatiha Sadat, François Lareau

For example, the antonymy is a type of relation that is represented by the lexical function Anti: Anti(big) = small.

Relation

Selecting Sentences versus Selecting Tree Constituents for Automatic Question Ranking

no code implementations COLING 2016 Alberto Barrón-Cedeño, Giovanni Da San Martino, Salvatore Romeo, Aless Moschitti, ro

Community question answering (cQA) websites are focused on users who query questions onto an online forum, expecting for other users to provide them answers or suggestions.

Community Question Answering Machine Translation +1

Towards a Distributional Model of Semantic Complexity

no code implementations WS 2016 Emmanuele Chersoni, Philippe Blache, Aless Lenci, ro

The composition cost of a sentence depends on the semantic coherence of the event being constructed and on the activation degree of the linguistic constructions.

Sentence

"Beware the Jabberwock, dear reader!" Testing the distributional reality of construction semantics

no code implementations WS 2016 Gianluca Lebani, Aless Lenci, ro

Notwithstanding the success of the notion of construction, the computational tradition still lacks a way to represent the semantic content of these linguistic entities.

Semantic Indexing of Multilingual Corpora and its Application on the History Domain

no code implementations WS 2016 Aless Raganato, ro, Jose Camacho-Collados, Antonio Raganato, Yunseo Joung

The increasing amount of multilingual text collections available in different domains makes its automatic processing essential for the development of a given field.

Retrieval Text Retrieval +1

Antonymy and Canonicity: Experimental and Distributional Evidence

no code implementations WS 2016 Andreana Pastena, Aless Lenci, ro

Previous studies have showed that some pairs of antonyms are perceived to be better examples of opposition than others, and are so considered representative of the whole category (e.g., Deese, 1964; Murphy, 2003; Paradis et al., 2009).

Named Entity Recognition and Hashtag Decomposition to Improve the Classification of Tweets

no code implementations WS 2016 Billal Belainine, Alexs Fonseca, ro, Fatiha Sadat

We evaluate and compare several automatic classification systems using part or all of the items described in our contributions and found that filtering by part of speech and named entity recognition dramatically increase the classification precision to 77.3%.

Classification General Classification +3

The CogALex-V Shared Task on the Corpus-Based Identification of Semantic Relations

no code implementations WS 2016 Enrico Santus, Anna Gladkova, Stefan Evert, Aless Lenci, ro

The task is split into two subtasks: (i) identification of related word pairs vs. unrelated ones; (ii) classification of the word pairs according to their semantic relation.

Language Acquisition Paraphrase Generation

Adapting an Entity Centric Model for Portuguese Coreference Resolution

no code implementations LREC 2016 Ev Fonseca, ro, Renata Vieira, Aline Vanin

This paper presents the adaptation of an Entity Centric Model for Portuguese coreference resolution, considering 10 named entity categories.

coreference-resolution

LexFr: Adapting the LexIt Framework to Build a Corpus-based French Subcategorization Lexicon

no code implementations LREC 2016 Giulia Rambelli, Gianluca Lebani, Laurent Prévot, Aless Lenci, ro

This paper introduces LexFr, a corpus-based French lexical resource built by adapting the framework LexIt, originally developed to describe the combinatorial potential of Italian predicates.

Italian VerbNet: A Construction-based Approach to Italian Verb Classification

no code implementations LREC 2016 Lucia Busso, Aless Lenci, ro

This paper proposes a new method for Italian verb classification -and a preliminary example of resulting classes- inspired by Levin (1993) and VerbNet (Kipper-Schuler, 2005), yet partially independent from these resources; we achieved such a result by integrating Levin and VerbNet's models of classification with other theoretic frameworks and resources.

Classification General Classification

Evaluating Context Selection Strategies to Build Emotive Vector Space Models

no code implementations LREC 2016 Lucia C. Passaro, Aless Lenci, ro

In this paper we compare different context selection approaches to improve the creation of Emotive Vector Space Models (VSMs).

Towards Lexical Encoding of Multi-Word Expressions in Spanish Dialects

no code implementations LREC 2016 Diana Bogantes, Eric Rodríguez, Alej Arauco, Alej Rodríguez, ro, Agata Savary

This paper describes a pilot study in lexical encoding of multi-word expressions (MWEs) in 4 Latin American dialects of Spanish: Costa Rican, Colombian, Mexican and Peruvian.

The DIRHA simulated corpus

no code implementations LREC 2014 Luca Cristoforetti, Mirco Ravanelli, Maurizio Omologo, Aless Sosi, ro, Alberto Abad, Martin Hagmueller, Petros Maragos

This paper describes a multi-microphone multi-language acoustic corpus being developed under the EC project Distant-speech Interaction for Robust Home Applications (DIRHA).

Dialogue Management Distant Speech Recognition +2

Crowdsourcing for the identification of event nominals: an experiment

no code implementations LREC 2014 Rachele Sprugnoli, Aless Lenci, ro

This paper presents the design and results of a crowdsourcing experiment on the recognition of Italian event nominals.

Question Answering

Choosing which to use? A study of distributional models for nominal lexical semantic classification

no code implementations LREC 2014 Lauren Romeo, Gianluca Lebani, Núria Bel, Aless Lenci, ro

This paper empirically evaluates the performances of different state-of-the-art distributional models in a nominal lexical semantic classification task.

General Classification Machine Translation +1

Bootstrapping an Italian VerbNet: data-driven analysis of verb alternations

no code implementations LREC 2014 Gianluca Lebani, Veronica Viola, Aless Lenci, ro

The goal of this paper is to propose a classification of the syntactic alternations admitted by the most frequent Italian verbs.

Classification General Classification +2

The SSPNet-Mobile Corpus: Social Signal Processing Over Mobile Phones.

no code implementations LREC 2014 Anna Polychroniou, Hugues Salamin, Aless Vinciarelli, ro

This article presents the SSPNet-Mobile Corpus, a collection of 60 mobile phone calls between unacquainted individuals (120 subjects).

SenTube: A Corpus for Sentiment Analysis on YouTube Social Media

no code implementations LREC 2014 Olga Uryupina, Barbara Plank, Aliaksei Severyn, Agata Rotondi, Aless Moschitti, ro

In this paper we present SenTube, a dataset of user-generated comments on YouTube videos annotated for information content and sentiment polarity.

Document Classification Informativeness +3

Investigating the Image of Entities in Social Media: Dataset Design and First Results

no code implementations LREC 2014 Julien Velcin, Young-Min Kim, Caroline Brun, Jean-Yves Dormagen, Eric SanJuan, Leila Khouas, Anne Peradotto, Stephane Bonnevay, Claude Roux, Julien Boyadjian, Alej Molina, ro, Marie Neihouser

The objective of this paper is to describe the design of a dataset that deals with the image (i.e., representation, web reputation) of various entities populating the Internet: politicians, celebrities, companies, brands etc.

Clustering Information Retrieval +2

The annotation of the C-ORAL-BRASIL oral through the implementation of the Palavras Parser

no code implementations LREC 2012 Eckhard Bick, Heliana Mello, Aless Panunzi, ro, Tommaso Raso

This article describes the morphosyntactic annotation of the C-ORAL-BRASIL speech corpus, using an adapted version of the Palavras parser.

Lemmatization

RIDIRE-CPI: an Open Source Crawling and Processing Infrastructure for Supervised Web-Corpora Building

no code implementations LREC 2012 Aless Panunzi, ro, Marco Fabbri, Massimo Moneglia, Lorenzo Gregori, Samuele Paladini

This paper introduces the RIDIRE-CPI, an open source tool for the building of web corpora with a specific design through a targeted crawling strategy.

POS

Enriching the ISST-TANL Corpus with Semantic Frames

no code implementations LREC 2012 Aless Lenci, ro, Simonetta Montemagni, Giulia Venturi, Maria Grazia Cutrullà

The paper describes the design and the results of a manual annotation methodology devoted to enrich the ISST-TANL Corpus, derived from the Italian Syntactic-Semantic Treebank (ISST), with Semantic Frames information.

LexIt: A Computational Resource on Italian Argument Structure

no code implementations LREC 2012 Aless Lenci, ro, Gabriella Lapesa, Giulia Bonansinga

The aim of this paper is to introduce LexIt, a computational framework for the automatic acquisition and exploration of distributional information about Italian verbs, nouns and adjectives, freely available through a web interface at the address http://sesia.humnet.unipi.it/lexit.
