Search Results for author: Jose Camacho-Collados

Found 67 papers, 31 papers with code

On the Cross-lingual Transferability of Contextualized Sense Embeddings

no code implementations EMNLP (MRL) 2021 Kiamehr Rezaee, Daniel Loureiro, Jose Camacho-Collados, Mohammad Taher Pilehvar

In this paper we analyze the extent to which contextualized sense embeddings, i.e., sense embeddings computed from contextualized word embeddings, are transferable across languages. To this end, we compiled a unified cross-lingual benchmark for Word Sense Disambiguation.

Word Embeddings Word Sense Disambiguation

Definition Extraction Feature Analysis: From Canonical to Naturally-Occurring Definitions

no code implementations COLING (CogALex) 2020 Mireia Roig Mirapeix, Luis Espinosa Anke, Jose Camacho-Collados

Textual definitions constitute a fundamental source of knowledge when seeking the meaning of words, and they are the cornerstone of lexical resources such as glossaries, dictionaries, encyclopedias and thesauri.

Definition Extraction

Construction Artifacts in Metaphor Identification Datasets

no code implementations 1 Nov 2023 Joanne Boisson, Luis Espinosa-Anke, Jose Camacho-Collados

Metaphor identification aims at understanding whether a given expression is used figuratively in context.

A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models

no code implementations 19 Oct 2023 Yi Zhou, Jose Camacho-Collados, Danushka Bollegala

Various types of social biases have been reported with pretrained Masked Language Models (MLMs) in prior work.

RelBERT: Embedding Relations with Language Models

1 code implementation 30 Sep 2023 Asahi Ushio, Jose Camacho-Collados, Steven Schockaert

In particular, we show that masked language models such as RoBERTa can be straightforwardly fine-tuned for this purpose, using only a small amount of training data.

Knowledge Graphs

Tweet Insights: A Visualization Platform to Extract Temporal Insights from Twitter

no code implementations 4 Aug 2023 Daniel Loureiro, Kiamehr Rezaee, Talayeh Riahi, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados

This paper introduces a large collection of time series data derived from Twitter, postprocessed using word embedding techniques, as well as specialized fine-tuned language models.

Time Series

A Practical Toolkit for Multilingual Question and Answer Generation

1 code implementation 27 May 2023 Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

Generating questions along with associated answers from a text has applications in several domains, such as creating reading comprehension tests for students, or improving document search by providing auxiliary questions and answers based on the query.

Answer Generation Reading Comprehension +1

An Empirical Comparison of LM-based Question and Answer Generation Methods

1 code implementation 26 May 2023 Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

This task has a variety of applications, such as data augmentation for question answering (QA) models, information retrieval and education.

Answer Generation Data Augmentation +4

An Efficient Multilingual Language Model Compression through Vocabulary Trimming

1 code implementation 24 May 2023 Asahi Ushio, Yi Zhou, Jose Camacho-Collados

Multilingual language models (LMs) have become a powerful tool in NLP, especially for non-English languages.

Language Modelling Model Compression
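The vocabulary-trimming idea behind this paper can be illustrated with a minimal sketch: keep only the subword entries observed in a target-language corpus (plus special tokens) and slice the embedding matrix accordingly. The function name, token inventory, and shapes below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def trim_vocabulary(vocab, embeddings, corpus_tokens, specials=("<pad>", "<unk>")):
    # vocab: token -> row index; embeddings: (V, d) matrix.
    # Keep special tokens plus any token seen in the target-language corpus.
    keep = [t for t in vocab if t in corpus_tokens or t in specials]
    new_vocab = {t: i for i, t in enumerate(keep)}
    # Slice the embedding matrix down to the retained rows.
    new_embeddings = embeddings[[vocab[t] for t in keep]]
    return new_vocab, new_embeddings

vocab = {"<pad>": 0, "<unk>": 1, "the": 2, "le": 3, "der": 4}
emb = np.random.rand(5, 8)
# Target-language (English) corpus uses only "the"; French/German entries are dropped.
new_vocab, new_emb = trim_vocabulary(vocab, emb, corpus_tokens={"the"})
print(sorted(new_vocab))   # -> ['<pad>', '<unk>', 'the']
print(new_emb.shape)       # -> (3, 8)
```

The compression comes entirely from the embedding and output layers, which dominate the parameter count of multilingual models; the transformer body is untouched.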

Generative Language Models for Paragraph-Level Question Generation

1 code implementation 8 Oct 2022 Asahi Ushio, Fernando Alva-Manchego, Jose Camacho-Collados

It includes general-purpose datasets such as SQuAD for English, datasets from ten domains and two styles, as well as datasets in eight different languages.

Question Answering Question Generation +1

Assessing the Limits of the Distributional Hypothesis in Semantic Spaces: Trait-based Relational Knowledge and the Impact of Co-occurrences

1 code implementation *SEM (NAACL) 2022 Mark Anderson, Jose Camacho-Collados

The increase in performance in NLP due to the prevalence of distributional models and deep learning has brought with it a reciprocal decrease in interpretability.

Negativity Spreads Faster: A Large-Scale Multilingual Twitter Analysis on the Role of Sentiment in Political Communication

1 code implementation 1 Feb 2022 Dimosthenis Antypas, Alun Preece, Jose Camacho-Collados

Social media has become extremely influential when it comes to policy making in modern societies, especially in the Western world, where platforms such as Twitter allow users to follow politicians, thus making citizens more involved in political discussion.

Sentiment Analysis

Guiding Generative Language Models for Data Augmentation in Few-Shot Text Classification

no code implementations 17 Nov 2021 Aleksandra Edwards, Asahi Ushio, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

Data augmentation techniques are widely used for enhancing the performance of machine learning models by tackling class imbalance issues and data sparsity.

Active Learning Data Augmentation +3

Distilling Relation Embeddings from Pre-trained Language Models

1 code implementation 21 Sep 2021 Asahi Ushio, Jose Camacho-Collados, Steven Schockaert

Among others, this makes it possible to distill high-quality word vectors from pre-trained language models.

Knowledge Graphs Language Modelling +2

Deriving Disinformation Insights from Geolocalized Twitter Callouts

1 code implementation 6 Aug 2021 David Tuxworth, Dimosthenis Antypas, Luis Espinosa-Anke, Jose Camacho-Collados, Alun Preece, David Rogers

In particular, the analysis is centered on Twitter and disinformation for three European languages: English, French and Spanish.

Language Modelling Specificity +1

LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond

1 code implementation 26 May 2021 Daniel Loureiro, Alípio Mário Jorge, Jose Camacho-Collados

Prior work has shown that these contextual representations can be used to accurately represent large sense inventories as sense embeddings, to the extent that a distance-based solution to Word Sense Disambiguation (WSD) tasks outperforms models trained specifically for the task.

Word Sense Disambiguation
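The distance-based WSD strategy described in this entry can be sketched in a few lines: each sense in the inventory gets an embedding, and a word in context is disambiguated by nearest-neighbour search under cosine similarity. This is a toy illustration, not the LMMS system itself; the sense keys and vectors are made up for demonstration.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def disambiguate(context_vec, sense_inventory):
    # Pick the sense whose embedding is closest to the contextual embedding.
    return max(sense_inventory, key=lambda s: cosine(context_vec, sense_inventory[s]))

senses = {
    "bank%finance": np.array([0.9, 0.1, 0.0]),
    "bank%river":   np.array([0.1, 0.8, 0.3]),
}
# Hypothetical contextual embedding of "bank" in "I deposited money at the bank".
ctx = np.array([0.8, 0.2, 0.1])
print(disambiguate(ctx, senses))  # -> bank%finance
```

In practice the contextual vector would come from a transformer encoder and the inventory would cover all WordNet senses, but the decision rule is exactly this nearest-neighbour comparison.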

Embeddings in Natural Language Processing

no code implementations COLING 2020 Jose Camacho-Collados, Mohammad Taher Pilehvar

Embeddings have been one of the most important topics of interest in NLP for the past decade.

Word Embeddings

Go Simple and Pre-Train on Domain-Specific Corpora: On the Role of Training Data for Text Classification

no code implementations COLING 2020 Aleksandra Edwards, Jose Camacho-Collados, Hélène de Ribaupierre, Alun Preece

Pre-trained language models provide the foundations for state-of-the-art performance across a wide range of natural language processing tasks, including text classification.

Language Modelling text-classification +2

Understanding the Source of Semantic Regularities in Word Embeddings

no code implementations CONLL 2020 Hsiao-Yu Chiang, Jose Camacho-Collados, Zachary Pardos

In this paper, we investigate the hypothesis that examples of a lexical relation in a corpus are fundamental to a neural word embedding's ability to complete analogies involving the relation.

Relation Word Embeddings

Analysis and Evaluation of Language Models for Word Sense Disambiguation

1 code implementation CL (ACL) 2021 Daniel Loureiro, Kiamehr Rezaee, Mohammad Taher Pilehvar, Jose Camacho-Collados

We also perform an in-depth comparison of the two main language-model-based WSD strategies, i.e., fine-tuning and feature extraction, finding that the latter approach is more robust with respect to sense bias and can better exploit limited available training data.

Language Modelling Word Sense Disambiguation

WiC-TSV: An Evaluation Benchmark for Target Sense Verification of Words in Context

1 code implementation EACL 2021 Anna Breit, Artem Revenko, Kiamehr Rezaee, Mohammad Taher Pilehvar, Jose Camacho-Collados

More specifically, we introduce a framework for Target Sense Verification of Words in Context, whose uniqueness is grounded in its formulation as a binary classification task, making it independent of external sense inventories, and in its coverage of various domains.

Ranked #1 on Entity Linking on WiC-TSV (Task 3 Accuracy: all metric)

Binary Classification Entity Linking +1

Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation

1 code implementation EMNLP 2020 Daniel Loureiro, Jose Camacho-Collados

State-of-the-art methods for Word Sense Disambiguation (WSD) combine two different features: the power of pre-trained language models and a propagation method to extend the coverage of such models.

Word Sense Disambiguation

Modelling Semantic Categories using Conceptual Neighborhood

no code implementations 3 Dec 2019 Zied Bouraoui, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

Unfortunately, meaningful regions can be difficult to estimate, especially since we often have few examples of individuals that belong to a given category.

Inducing Relational Knowledge from BERT

no code implementations 28 Nov 2019 Zied Bouraoui, Jose Camacho-Collados, Steven Schockaert

Starting from a few seed instances of a given relation, we first use a large text corpus to find sentences that are likely to express this relation.

Language Modelling Relation +1

Meemi: A Simple Method for Post-processing and Integrating Cross-lingual Word Embeddings

no code implementations 16 Oct 2019 Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

While monolingual word embeddings encode information about words in the context of a particular language, cross-lingual embeddings define a multilingual space where word embeddings from two or more languages are integrated together.

Cross-Lingual Natural Language Inference Cross-Lingual Word Embeddings +3

On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning

no code implementations LREC 2020 Yerai Doval, Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

Cross-lingual word embeddings are vector representations of words in different languages where words with similar meaning are represented by similar vectors, regardless of the language.

Cross-Lingual Word Embeddings Word Embeddings

Relational Word Embeddings

1 code implementation ACL 2019 Jose Camacho-Collados, Luis Espinosa-Anke, Steven Schockaert

While word embeddings have been shown to implicitly encode various forms of attributional knowledge, the extent to which they capture relational information is far more limited.

Word Embeddings

Interpretable Emoji Prediction via Label-Wise Attention LSTMs

no code implementations EMNLP 2018 Francesco Barbieri, Luis Espinosa-Anke, Jose Camacho-Collados, Steven Schockaert, Horacio Saggion

Human language has evolved towards newer forms of communication such as social media, where emojis (i.e., ideograms bearing a visual meaning) play a key role.

Emotion Recognition Information Retrieval +3

The Interplay between Lexical Resources and Natural Language Processing

1 code implementation NAACL 2018 Jose Camacho-Collados, Luis Espinosa-Anke, Mohammad Taher Pilehvar

Incorporating linguistic, world and common sense knowledge into AI/NLP systems is currently an important research area, with several open problems and challenges.

Common Sense Reasoning

How Gender and Skin Tone Modifiers Affect Emoji Semantics in Twitter

1 code implementation SEMEVAL 2018 Francesco Barbieri, Jose Camacho-Collados

Our analyses reveal that some stereotypes related to skin color and gender seem to be reflected in the use of these modifiers.

Word Embeddings

From Word to Sense Embeddings: A Survey on Vector Representations of Meaning

no code implementations 10 May 2018 Jose Camacho-Collados, Mohammad Taher Pilehvar

Over the past years, distributed semantic representations have proved to be effective and flexible keepers of prior knowledge to be integrated into downstream applications.

A Short Survey on Sense-Annotated Corpora

no code implementations LREC 2020 Tommaso Pasini, Jose Camacho-Collados

Large sense-annotated datasets are increasingly necessary for training deep supervised systems in Word Sense Disambiguation.

Word Sense Disambiguation

SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity

no code implementations SEMEVAL 2017 Jose Camacho-Collados, Mohammad Taher Pilehvar, Nigel Collier, Roberto Navigli

This paper introduces a new task on Multilingual and Cross-lingual Semantic Word Similarity, which measures the semantic similarity of word pairs within and across five languages: English, Farsi, German, Italian and Spanish.

Information Retrieval Machine Translation +9

On the Role of Text Preprocessing in Neural Network Architectures: An Evaluation Study on Text Categorization and Sentiment Analysis

3 code implementations WS 2018 Jose Camacho-Collados, Mohammad Taher Pilehvar

In this paper we investigate the impact of simple text preprocessing decisions (particularly tokenizing, lemmatizing, lowercasing and multiword grouping) on the performance of a standard neural text classifier.

Sentiment Analysis Text Categorization +1
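The preprocessing decisions studied in this paper (tokenizing, lowercasing, lemmatizing, multiword grouping) can be illustrated with a minimal sketch. The naive tokenizer and the tiny multiword lexicon below are stand-in assumptions for demonstration, not the authors' actual pipeline.

```python
import re

def tokenize(text):
    # Naive word tokenizer: split on runs of non-alphanumeric characters.
    return [t for t in re.split(r"\W+", text) if t]

def lowercase(tokens):
    return [t.lower() for t in tokens]

def group_multiwords(tokens, multiwords):
    # Merge known multiword expressions into single underscore-joined tokens,
    # preferring longer matches first.
    out, i = [], 0
    while i < len(tokens):
        for length in (3, 2):
            phrase = tuple(tokens[i:i + length])
            if len(phrase) == length and phrase in multiwords:
                out.append("_".join(phrase))
                i += length
                break
        else:
            out.append(tokens[i])
            i += 1
    return out

mwes = {("new", "york"), ("machine", "translation")}
tokens = lowercase(tokenize("Machine Translation systems in New York"))
print(group_multiwords(tokens, mwes))
# -> ['machine_translation', 'systems', 'in', 'new_york']
```

Each of these steps changes the vocabulary the downstream classifier sees, which is exactly the axis the paper's evaluation varies.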

EuroSense: Automatic Harvesting of Multilingual Sense Annotations from Parallel Text

no code implementations ACL 2017 Claudio Delli Bovi, Jose Camacho-Collados, Alessandro Raganato, Roberto Navigli

Parallel corpora are widely used in a variety of Natural Language Processing tasks, from Machine Translation to cross-lingual Word Sense Disambiguation, where parallel sentences can be exploited to automatically generate high-quality sense annotations on a large scale.

Entity Linking Machine Translation +2

BabelDomains: Large-Scale Domain Labeling of Lexical Resources

no code implementations EACL 2017 Jose Camacho-Collados, Roberto Navigli

In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge.

Clustering Domain Adaptation +4

Why we have switched from building full-fledged taxonomies to simply detecting hypernymy relations

no code implementations 12 Mar 2017 Jose Camacho-Collados

The study of taxonomies and hypernymy relations has been extensive in the Natural Language Processing (NLP) literature.

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

no code implementations CONLL 2017 Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora.

Word Embeddings

Semantic Indexing of Multilingual Corpora and its Application on the History Domain

no code implementations WS 2016 Alessandro Raganato, Jose Camacho-Collados, Antonio Raganato, Yunseo Joung

The increasing amount of multilingual text collections available in different domains makes their automatic processing essential for the development of a given field.

Retrieval Text Retrieval +1
