#Election2020: The First Public Twitter Dataset on the 2020 US Presidential Election

1 Oct 2020  ·  Emily Chen, Ashok Deb, Emilio Ferrara ·

The integrity of democratic political discourse is at the core to guarantee free and fair elections. With social media often dictating the tones and trends of politics-related discussion, it is of paramount important to be able to study online chatter, especially in the run up to important voting events, like in the case of the upcoming November 3, 2020 U.S. Presidential Election. Limited access to social media data is often the first barrier to impede, hinder, or slow down progress, and ultimately our understanding of online political discourse. To mitigate this issue and try to empower the Computational Social Science research community, we decided to publicly release a massive-scale, longitudinal dataset of U.S. politics- and election-related tweets. This multilingual dataset that we have been collecting for over one year encompasses hundreds of millions of tweets and tracks all salient U.S. politics trends, actors, and events between 2019 and 2020. It predates and spans the whole period of Republican and Democratic primaries, with real-time tracking of all presidential contenders of both sides of the isle. After that, it focuses on presidential and vice-presidential candidates. Our dataset release is curated, documented and will be constantly updated on a weekly-basis, until the November 3, 2020 election and beyond. We hope that the academic community, computational journalists, and research practitioners alike will all take advantage of our dataset to study relevant scientific and social issues, including problems like misinformation, information manipulation, interference, and distortion of online political discourse that have been prevalent in the context of recent election events in the United States and worldwide. Our dataset is available at: https://github.com/echen102/us-pres-elections-2020

PDF Abstract

Datasets


Introduced in the Paper:

Election2020