We develop a primary dataset based on our task of suicide or depression classification. This dataset is web-scraped from Reddit. We collect our data from subreddits using the Python Reddit API. We specifically scrape from two subreddits, r/SuicideWatch3 and r/Depression. The dataset contains 1,895 total posts. We utilize two fields from the scraped data: the original text of the post as our inputs, and the subreddit it belongs to as labels. Posts from r/SuicideWatch are labeled as suicidal, and posts from r/Depression are labeled as depressed. We make this dataset and the web-scraping script available in our code.
Paper | Code | Results | Date | Stars |
---|