Detecting Anxiety through Reddit

WS 2017 · Judy Hanwen Shen, Frank Rudzicz

Previous investigations into detecting mental illnesses through social media have predominantly focused on detecting depression through Twitter corpora. In this paper, we study anxiety disorders through personal narratives collected through the popular social media website Reddit. We build a substantial data set of typical and anxiety-related posts, and we apply N-gram language modeling, vector embeddings, topic analysis, and emotional norms to generate features that accurately classify posts related to binary levels of anxiety. We achieve an accuracy of 91% with vector-space word embeddings, and an accuracy of 98% when combined with lexicon-based features.
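The idea of combining embedding features with lexicon-based features can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy embeddings, the toy lexicon, and the function names (`tokenize`, `mean_embedding`, `lexicon_ratio`, `featurize`) are all hypothetical. The sketch only shows how a post might be mapped to a single feature vector before it is passed to a binary classifier.

```python
from typing import Dict, List

# Hypothetical toy resources for illustration only; these are NOT the
# embeddings or lexicon used in the paper.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "worried": [0.9, 0.1],
    "panic": [0.8, 0.2],
    "movie": [0.1, 0.9],
}
TOY_LEXICON = {"worried", "panic"}


def tokenize(post: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return post.lower().split()


def mean_embedding(tokens: List[str], dim: int = 2) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def lexicon_ratio(tokens: List[str]) -> float:
    """Fraction of tokens that appear in the toy lexicon."""
    if not tokens:
        return 0.0
    return sum(t in TOY_LEXICON for t in tokens) / len(tokens)


def featurize(post: str) -> List[float]:
    """Concatenate embedding features with one lexicon-based feature."""
    tokens = tokenize(post)
    return mean_embedding(tokens) + [lexicon_ratio(tokens)]
```

Calling `featurize("worried panic")` yields a three-element vector: two averaged embedding dimensions followed by the lexicon ratio. Any standard binary classifier could then be trained on such vectors; the choice of classifier is left open here.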
