Online conversations can go in many directions: some turn out poorly due to antisocial behavior, while others turn out positively to the benefit of all.
Online abusive behavior affects millions, and the NLP community has attempted to mitigate this problem by developing technologies to detect abuse.
In this dataset paper, we present a three-stage process to collect Reddit comments that were removed by moderators of several subreddits for violating subreddit rules and guidelines.
Social and Information Networks
In this work, we 1) develop machine learning models that predict whether a Twitter account is a Russian troll within a set of 170K control accounts, and 2) demonstrate that these models can be used to find active Twitter accounts that are still likely acting on behalf of the Russian state.
Social and Information Networks; Computers and Society