BERTweet: A pre-trained language model for English Tweets

20 May 2020 · Dat Quoc Nguyen, Thanh Vu, Anh Tuan Nguyen

We present BERTweet, the first public large-scale pre-trained language model for English Tweets. Our BERTweet, having the same architecture as BERT-base (Devlin et al., 2019), is trained using the RoBERTa pre-training procedure (Liu et al., 2019).
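
Because BERTweet follows the standard BERT-base architecture and is distributed as a pre-trained checkpoint, it can serve as a drop-in encoder for Tweet text. The snippet below is a minimal, illustrative sketch using the Hugging Face transformers library; it assumes the authors' checkpoint is published on the Hub as vinai/bertweet-base, and the normalization=True flag (the tokenizer's built-in Tweet normalization, e.g. mapping user mentions and URLs to @USER and HTTPURL) may require the extra emoji package.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed Hub model ID for the released checkpoint.
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", normalization=True)
model = AutoModel.from_pretrained("vinai/bertweet-base")

tweet = "SC has first two presumptive cases of coronavirus , DHEC confirms HTTPURL via @USER :cry:"
inputs = tokenizer(tweet, return_tensors="pt")

with torch.no_grad():
    # Contextual embeddings: shape (1, sequence_length, 768) for the base architecture.
    features = model(**inputs).last_hidden_state
```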

Results from the Paper


TASK                     DATASET    MODEL     METRIC NAME   METRIC VALUE   GLOBAL RANK
Part-Of-Speech Tagging   Ritter     BERTweet  Acc           90.1           # 4
Part-Of-Speech Tagging   Tweebank   BERTweet  Acc           95.2           # 2
Sentiment Analysis       TweetEval  BERTweet  Emoji         33.4           # 1
Sentiment Analysis       TweetEval  BERTweet  Emotion       79.3           # 2
Sentiment Analysis       TweetEval  BERTweet  Hate          56.4           # 1
Sentiment Analysis       TweetEval  BERTweet  Irony         82.1           # 1
Sentiment Analysis       TweetEval  BERTweet  Offensive     79.5           # 2
Sentiment Analysis       TweetEval  BERTweet  Sentiment     73.4           # 1
Sentiment Analysis       TweetEval  BERTweet  Stance        71.2           # 1
Sentiment Analysis       TweetEval  BERTweet  ALL           67.9           # 1
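
Classification scores like those above are typically obtained by placing a softmax classification head on top of the pre-trained encoder and fine-tuning end to end. The following is a hypothetical sketch with the Hugging Face transformers API; the label set, example texts, and training step are placeholders, not the paper's exact setup.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/bertweet-base",
    num_labels=3,  # placeholder label set, e.g. negative / neutral / positive
)

batch = tokenizer(
    ["I love this!", "This is awful ..."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
labels = torch.tensor([2, 0])

# One training step; in practice plug into an optimizer such as AdamW with
# weight decay and a linear warmup / linear decay learning-rate schedule.
outputs = model(**batch, labels=labels)
outputs.loss.backward()
```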

Methods used in the Paper


METHOD                            TYPE
Weight Decay                      Regularization
Softmax                           Output Functions
Adam                              Stochastic Optimization
Multi-Head Attention              Attention Modules
Dropout                           Regularization
GELU                              Activation Functions
Attention Dropout                 Regularization
Linear Warmup With Linear Decay   Learning Rate Schedules
Dense Connections                 Feedforward Networks
Layer Normalization               Normalization
Scaled Dot-Product Attention      Attention Mechanisms
WordPiece                         Subword Segmentation
Residual Connection               Skip Connections
BERT                              Language Models
RoBERTa                           Transformers
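
Most entries in this table are the standard Transformer building blocks inherited from BERT-base. As a reference point (not the paper's code), a minimal sketch of scaled dot-product attention, the mechanism inside each Multi-Head Attention module and the place where the Softmax entry above is applied:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: tensors of shape (batch, heads, seq_len, d_k)."""
    d_k = q.size(-1)
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients stable.
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)  # attention weights over the key positions
    return torch.matmul(weights, v)
```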