BanglaEmotion (BanglaEmotion: A Benchmark Dataset for Bangla Textual Emotion Analysis)

BanglaEmotion is a manually annotated Bangla emotion corpus that captures the diversity of fine-grained emotion expressions in social-media text. It uses six emotion labels, Sadness, Happiness, Disgust, Surprise, Fear and Anger, which according to Paul Ekman (1999) are the six basic emotion categories. A large amount of raw text was collected from user comments on two Facebook groups (Ekattor TV and Airport Magistrates) and on the public posts of a popular blogger and activist, Dr. Imran H Sarker. These comments are mostly reactions to ongoing socio-political issues and to the economic successes and failures of Bangladesh. A total of 32923 comments were scraped from these three sources, of which 6314 were annotated into the six categories. The distribution of the annotated corpus is as follows:

sad = 1341
happy = 1908
disgust = 703
surprise = 562
fear = 384
angry = 1416
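
The counts above can be checked against the stated total and turned into class shares with a few lines of Python. This is only an illustrative sketch: the dictionary simply restates the figures listed above, and the variable names are our own, not part of any released loader.

```python
# Per-class counts as listed in the dataset description.
counts = {
    "sad": 1341,
    "happy": 1908,
    "disgust": 703,
    "surprise": 562,
    "fear": 384,
    "angry": 1416,
}

# The counts should add up to the 6314 annotated comments.
total = sum(counts.values())

# Share of each class in percent, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Ratio between the largest and smallest class, a rough imbalance measure.
imbalance_ratio = max(counts.values()) / min(counts.values())

for label, n in counts.items():
    print(f"{label:>8}: {n:5d}  ({shares[label]}%)")
print(f"total = {total}, largest/smallest = {imbalance_ratio:.2f}")
```

The largest class (happy) is roughly five times the size of the smallest (fear), which is the kind of skew a balanced subset is meant to address.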

A balanced set is also derived from the above data and split into training and test sets, with a proportion of 5:1 used for training and evaluation. More information on the dataset and the experiments on it can be found in our paper (related links below).
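
One way such a balanced 5:1 split could be produced is sketched below. The description does not say how the balanced set was actually built, so downsampling every class to the size of the smallest one is an assumption, as are the function name `balanced_split`, its parameters, and the placeholder comments used in the demo.

```python
import random
from collections import defaultdict


def balanced_split(samples, train_parts=5, test_parts=1, seed=0):
    """Downsample each class to the smallest class size, then split
    every class train:test in the ratio train_parts:test_parts.

    `samples` is a list of (text, label) pairs. This is a hypothetical
    helper, not the procedure used by the dataset authors.
    """
    rng = random.Random(seed)

    # Group texts by emotion label.
    by_label = defaultdict(list)
    for text, label in samples:
        by_label[label].append(text)

    # Assumption: balance by downsampling to the minority class size.
    per_class = min(len(texts) for texts in by_label.values())
    n_train = per_class * train_parts // (train_parts + test_parts)

    train, test = [], []
    for label, texts in sorted(by_label.items()):
        chosen = rng.sample(texts, per_class)
        train += [(t, label) for t in chosen[:n_train]]
        test += [(t, label) for t in chosen[n_train:]]

    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Demo on placeholder strings that mimic the published class counts.
counts = {"sad": 1341, "happy": 1908, "disgust": 703,
          "surprise": 562, "fear": 384, "angry": 1416}
samples = [(f"comment-{label}-{i}", label)
           for label, n in counts.items() for i in range(n)]

train, test = balanced_split(samples)
print(len(train), len(test))
```

Under this assumption each class contributes the same number of comments to both splits, so the training and test sets keep identical class proportions.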


