The EmoTag1200 dataset is a collection of resources for analyzing the emotion and sentiment of emojis as well as tweets written in English. The name EmoTag indicates its usefulness in exploiting emojis for emotional tagging.

The dataset includes the following resources:

  • Baseline Emoji Emotion Scores: 1200 emoji-emotion pairs annotated by humans. Each of the 150 most popular Twitter emojis has a score between 0 and 1 for each of 8 emotion classes (anger, anticipation, disgust, fear, joy, sadness, surprise, and trust).
  • Interpretable Word Vectors: 620-dimensional vector representations of words and emojis, trained on ~20.8 million emoji-centric tweets.
  • Raw Tweets: Tweet IDs of the ~20.8 million tweets used in the experiments.
  • Word-Emoji Co-occurrence Frequencies: A lexicon of word-emoji co-occurrence frequencies observed in the dataset.
  • Emoji-Emoji Co-occurrence Frequencies: The subset of the previous lexicon that contains only emoji-emoji co-occurrence counts.
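The layout of the baseline emotion scores described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `EMOTIONS`, `NUM_EMOJIS`, and `dominant_emotion` are hypothetical, the example score values are placeholders rather than values from the dataset, and the dataset's actual file format is not specified here.

```python
# The 8 emotion classes named in the dataset description.
EMOTIONS = (
    "anger", "anticipation", "disgust", "fear",
    "joy", "sadness", "surprise", "trust",
)

# The description states that 150 emojis are scored.
NUM_EMOJIS = 150

# 150 emojis x 8 emotions gives the 1200 pairs behind the name EmoTag1200.
NUM_PAIRS = NUM_EMOJIS * len(EMOTIONS)


def dominant_emotion(scores):
    """Return the emotion with the highest score for one emoji.

    `scores` maps each of the 8 emotion names to a value in [0, 1].
    This helper is hypothetical and not part of the dataset.
    """
    if set(scores) != set(EMOTIONS):
        raise ValueError("scores must cover exactly the 8 emotion classes")
    for emotion, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"score for {emotion} is outside [0, 1]")
    # Ties resolve to the emotion listed first in EMOTIONS.
    return max(EMOTIONS, key=lambda emotion: scores[emotion])


# Placeholder values for illustration only; NOT taken from the dataset.
example_scores = {
    "anger": 0.0, "anticipation": 0.3, "disgust": 0.0, "fear": 0.0,
    "joy": 0.9, "sadness": 0.0, "surprise": 0.2, "trust": 0.4,
}
top_emotion = dominant_emotion(example_scores)
```

With the placeholder scores above, `top_emotion` is the class with the largest value. Real scores would come from the dataset's annotation file.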

License


  • Unknown
