Search Results for author: Leonardo Neves

Found 28 papers, 13 papers with code

USE: Dynamic User Modeling with Stateful Sequence Models

no code implementations • 20 Mar 2024 • Zhihan Zhou, Qixiang Fang, Leonardo Neves, Francesco Barbieri, Yozen Liu, Han Liu, Maarten W. Bos, Ron Dotsch

Furthermore, we introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction by forecasting a broader horizon of upcoming user behaviors.

Contrastive Learning

Designing and Evaluating General-Purpose User Representations Based on Behavioral Logs from a Measurement Process Perspective: A Case Study with Snapchat

no code implementations • 19 Dec 2023 • Qixiang Fang, Zhihan Zhou, Francesco Barbieri, Yozen Liu, Leonardo Neves, Dong Nguyen, Daniel L. Oberski, Maarten W. Bos, Ron Dotsch

Using this new framework, we design a Transformer-based user model that can produce high-quality general-purpose user representations for instant messaging platforms like Snapchat.

Representation Learning

SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research

no code implementations • 23 Oct 2023 • Dimosthenis Antypas, Asahi Ushio, Francesco Barbieri, Leonardo Neves, Kiamehr Rezaee, Luis Espinosa-Anke, Jiaxin Pei, Jose Camacho-Collados

Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks.

Language Modelling

Context-aware Adversarial Attack on Named Entity Recognition

no code implementations • 16 Sep 2023 • Shuguang Chen, Leonardo Neves, Thamar Solorio

In recent years, large pre-trained language models (PLMs) have achieved remarkable performance on many natural language processing benchmarks.

Adversarial Attack named-entity-recognition +1

Tweet Insights: A Visualization Platform to Extract Temporal Insights from Twitter

no code implementations • 4 Aug 2023 • Daniel Loureiro, Kiamehr Rezaee, Talayeh Riahi, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados

This paper introduces a large collection of time series data derived from Twitter, postprocessed using word embedding techniques, as well as specialized fine-tuned language models.

Time Series

Style Transfer as Data Augmentation: A Case Study on Named Entity Recognition

1 code implementation • 14 Oct 2022 • Shuguang Chen, Leonardo Neves, Thamar Solorio

In this work, we take the named entity recognition task in the English language as a case study and explore style transfer as a data augmentation method to increase the size and diversity of training data in low-resource scenarios.

Data Augmentation named-entity-recognition +4

Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts

1 code implementation • 7 Oct 2022 • Asahi Ushio, Leonardo Neves, Vitor Silva, Francesco Barbieri, Jose Camacho-Collados

Recent progress in language model pre-training has led to important improvements in Named Entity Recognition (NER).

Language Modelling Named Entity Recognition

SemEval 2023 Task 9: Multilingual Tweet Intimacy Analysis

no code implementations • 3 Oct 2022 • Jiaxin Pei, Vítor Silva, Maarten Bos, Yozon Liu, Leonardo Neves, David Jurgens, Francesco Barbieri

We propose MINT, a new Multilingual INTimacy analysis dataset covering 13,372 tweets in 10 languages including English, French, Spanish, Italian, Portuguese, Korean, Dutch, Chinese, Hindi, and Arabic.

Twitter Topic Classification

no code implementations • COLING 2022 • Dimosthenis Antypas, Asahi Ushio, Jose Camacho-Collados, Leonardo Neves, Vítor Silva, Francesco Barbieri

Social media platforms host discussions about a wide variety of topics that arise every day.

Classification Topic Classification

TempoWiC: An Evaluation Benchmark for Detecting Meaning Shift in Social Media

1 code implementation • COLING 2022 • Daniel Loureiro, Aminette D'Souza, Areej Nasser Muhajab, Isabella A. White, Gabriel Wong, Luis Espinosa Anke, Leonardo Neves, Francesco Barbieri, Jose Camacho-Collados

To bridge this gap, we present TempoWiC, a new benchmark especially aimed at accelerating research in social media-based meaning shift.

TweetNLP: Cutting-Edge Natural Language Processing for Social Media

1 code implementation • 29 Jun 2022 • Jose Camacho-Collados, Kiamehr Rezaee, Talayeh Riahi, Asahi Ushio, Daniel Loureiro, Dimosthenis Antypas, Joanne Boisson, Luis Espinosa-Anke, Fangyu Liu, Eugenio Martínez-Cámara, Gonzalo Medina, Thomas Buhrmann, Leonardo Neves, Francesco Barbieri

In this paper we present TweetNLP, an integrated platform for Natural Language Processing (NLP) in social media.

Language Identification Named Entity Recognition +2

TimeLMs: Diachronic Language Models from Twitter

2 code implementations • ACL 2022 • Daniel Loureiro, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados

Despite its importance, the time variable has been largely neglected in the NLP and language model literature.

Continual Learning Language Modelling

Data Augmentation for Cross-Domain Named Entity Recognition

1 code implementation • EMNLP 2021 • Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio

Current work in named entity recognition (NER) shows that data augmentation techniques can produce more robust models.

Cross-Domain Named Entity Recognition Data Augmentation +3

Mitigating Temporal-Drift: A Simple Approach to Keep NER Models Crisp

1 code implementation • NAACL (SocialNLP) 2021 • Shuguang Chen, Leonardo Neves, Thamar Solorio

Performance of neural models for named entity recognition degrades over time, becoming stale.

named-entity-recognition Named Entity Recognition +1

Efficient Learning of Less Biased Models with Transfer Learning

no code implementations • 1 Jan 2021 • Xisen Jin, Francesco Barbieri, Leonardo Neves, Xiang Ren

Prediction bias in machine learning models, referring to undesirable model behaviors that discriminate against inputs mentioning or produced by certain groups, has drawn increasing attention from the research community given its societal impact.

Transfer Learning

The Devil is in the Details: Evaluating Limitations of Transformer-based Methods for Granular Tasks

1 code implementation • COLING 2020 • Brihi Joshi, Neil Shah, Francesco Barbieri, Leonardo Neves

Contextual embeddings derived from transformer-based neural language models have shown state-of-the-art performance for various tasks such as question answering, sentiment analysis, and textual similarity in recent years.

Question Answering Sentiment Analysis

On Transferability of Bias Mitigation Effects in Language Model Fine-Tuning

no code implementations • NAACL 2021 • Xisen Jin, Francesco Barbieri, Brendan Kennedy, Aida Mostafazadeh Davani, Leonardo Neves, Xiang Ren

Fine-tuned language models have been shown to exhibit biases against protected groups in a host of modeling tasks such as text classification and coreference resolution.

coreference-resolution Fairness +6

TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification

2 code implementations • Findings of the Association for Computational Linguistics 2020 • Francesco Barbieri, Jose Camacho-Collados, Leonardo Neves, Luis Espinosa-Anke

The experimental landscape in natural language processing for social media is too fragmented.

Ranked #3 on Sentiment Analysis on TweetEval

Classification General Classification +2

Can images help recognize entities? A study of the role of images for Multimodal NER

1 code implementation • WNUT (ACL) 2021 • Shuguang Chen, Gustavo Aguilar, Leonardo Neves, Thamar Solorio

Multimodal named entity recognition (MNER) requires bridging the gap between language understanding and visual context.

Image Captioning named-entity-recognition +2

Data Augmentation for Graph Neural Networks

2 code implementations • 11 Jun 2020 • Tong Zhao, Yozen Liu, Leonardo Neves, Oliver Woodford, Meng Jiang, Neil Shah

Our work shows that neural edge predictors can effectively encode class-homophilic structure to promote intra-class edges and demote inter-class edges in a given graph structure, and our main contribution introduces the GAug graph data augmentation framework, which leverages these insights to improve performance in GNN-based node classification via edge prediction.

Ranked #1 on Node Classification on Flickr

Data Augmentation General Classification +1

LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation

no code implementations • ACL 2020 • Dong-Ho Lee, Rahul Khanna, Bill Yuchen Lin, Jamin Chen, Seyeon Lee, Qinyuan Ye, Elizabeth Boschee, Leonardo Neves, Xiang Ren

Successfully training a deep neural network demands a huge corpus of labeled data.

named-entity-recognition Named Entity Recognition +3

Learning from Explanations with Neural Execution Tree

1 code implementation • ICLR 2020 • Ziqi Wang, Yujia Qin, Wenxuan Zhou, Jun Yan, Qinyuan Ye, Leonardo Neves, Zhiyuan Liu, Xiang Ren

While deep neural networks have achieved impressive performance on a range of NLP tasks, these data-hungry models heavily rely on labeled data, which restricts their applications in scenarios where data annotation is expensive.

Data Augmentation Multi-hop Question Answering +6

NERO: A Neural Rule Grounding Framework for Label-Efficient Relation Extraction

2 code implementations • 5 Sep 2019 • Wenxuan Zhou, Hongtao Lin, Bill Yuchen Lin, Ziqi Wang, Junyi Du, Leonardo Neves, Xiang Ren

The soft matching module learns to match rules with semantically similar sentences such that raw corpora can be automatically labeled and leveraged by the RE module (with much better coverage) as augmented supervision, in addition to the exactly matched sentences.

Relation Relation Extraction +1

Train One Get One Free: Partially Supervised Neural Network for Bug Report Duplicate Detection and Clustering

no code implementations • NAACL 2019 • Lahari Poddar, Leonardo Neves, William Brendel, Luis Marujo, Sergey Tulyakov, Pradeep Karuturi

Leveraging the assumption that learning the topic of a bug is a sub-task for detecting duplicates, we design a loss function that can jointly perform both tasks but needs supervision for only duplicate classification, achieving topic clustering in an unsupervised fashion.

Clustering General Classification

Visual Attention Model for Name Tagging in Multimodal Social Media

no code implementations • ACL 2018 • Di Lu, Leonardo Neves, Vitor Carvalho, Ning Zhang, Heng Ji

Every day, billions of multimodal posts containing both images and text are shared on social media sites such as Snapchat, Twitter or Instagram.

Natural Language Understanding Question Answering

Multimodal Named Entity Disambiguation for Noisy Social Media Posts

no code implementations • ACL 2018 • Seungwhan Moon, Leonardo Neves, Vitor Carvalho

We introduce the new Multimodal Named Entity Disambiguation (MNED) task for multimodal social media posts such as Snapchat or Instagram captions, which are composed of short captions with accompanying images.

Entity Disambiguation Image Captioning +2

Multimodal Named Entity Recognition for Short Social Media Posts

no code implementations • NAACL 2018 • Seungwhan Moon, Leonardo Neves, Vitor Carvalho

We introduce a new task called Multimodal Named Entity Recognition (MNER) for noisy user-generated data such as tweets or Snapchat captions, which comprise short text with accompanying images.

named-entity-recognition Named Entity Recognition +1

Visual Features for Context-Aware Speech Recognition

no code implementations • 1 Dec 2017 • Abhinav Gupta, Yajie Miao, Leonardo Neves, Florian Metze

We are working on a corpus of "how-to" videos from the web, and the idea is that an object that can be seen ("car"), or a scene that is being detected ("kitchen") can be used to condition both models on the "context" of the recording, thereby reducing perplexity and improving transcription.

Language Modelling speech-recognition +1
