The SOLO Corpus comprises over 4 million English tweets, each containing at least one of the following tokens: solitude, lonely, or loneliness. The corpus was collected to analyze the language and emotions associated with the state of being alone.

Tweets related to the state of being alone were collected by polling the Twitter API from August 28, 2018 to July 10, 2019 with the following query terms: loneliness, lonely, and solitude. Duplicate tweets, short tweets (containing fewer than three words), and tweets with external URLs were discarded. In addition, at most three tweets per user were kept, which reduces the impact of prolific tweeters and bots on the corpus.
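The filtering steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `filter_tweets`, the `(user_id, text)` input format, the URL pattern, the whitespace-based word count, and the case-insensitive exact-text notion of a duplicate are all assumptions made for illustration. Only the query terms, the three-word minimum, and the three-tweets-per-user cap come from the description.

```python
import re
from collections import defaultdict

# Values taken from the corpus description.
QUERY_TERMS = {"loneliness", "lonely", "solitude"}
MIN_WORDS = 3
MAX_TWEETS_PER_USER = 3

# Assumption: a simple pattern is enough to detect external URLs.
URL_PATTERN = re.compile(r"https?://\S+|www\.\S+", re.IGNORECASE)


def filter_tweets(tweets):
    """Apply the described filters to an iterable of (user_id, text) pairs.

    Hypothetical helper: returns the kept pairs in their original order.
    """
    seen_texts = set()            # normalized texts already encountered
    per_user = defaultdict(int)   # number of tweets kept per user
    kept = []

    for user_id, text in tweets:
        # Assumption: duplicates are compared after lowercasing and
        # collapsing whitespace.
        normalized = " ".join(text.lower().split())

        # Keep only tweets that contain at least one query term as a token.
        tokens = set(re.findall(r"[a-z]+", normalized))
        if not tokens & QUERY_TERMS:
            continue

        # Discard duplicate tweets.
        if normalized in seen_texts:
            continue
        seen_texts.add(normalized)

        # Discard short tweets (fewer than three words).
        if len(normalized.split()) < MIN_WORDS:
            continue

        # Discard tweets with external URLs.
        if URL_PATTERN.search(normalized):
            continue

        # Keep at most three tweets per user.
        if per_user[user_id] >= MAX_TWEETS_PER_USER:
            continue
        per_user[user_id] += 1

        kept.append((user_id, text))

    return kept


sample = [
    ("u1", "I feel so lonely tonight"),
    ("u1", "I feel so lonely tonight"),               # duplicate
    ("u2", "lonely"),                                 # too short
    ("u3", "Solitude is bliss http://example.com"),   # contains a URL
    ("u4", "Happy day with friends"),                 # no query term
    ("u5", "lonely day number one"),
    ("u5", "lonely day number two"),
    ("u5", "lonely day number three"),
    ("u5", "lonely day number four"),                 # exceeds per-user cap
]
kept = filter_tweets(sample)
```

In this sketch the per-user cap is applied last, so a user's discarded tweets do not count toward the limit of three. The description does not state the order in which the filters were applied, so this ordering is also an assumption.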

Source: SOLO: A Corpus of Tweets for Examining the State of Being Alone

License


  • Unknown
