MuMiN is a misinformation graph dataset containing rich social media data (tweets, replies, users, images, articles, hashtags). It spans 21 million tweets belonging to 26 thousand Twitter threads, each of which has been semantically linked to 13 thousand fact-checked claims across dozens of topics, events and domains, in 41 different languages, covering more than a decade.
4 PAPERS • 3 BENCHMARKS
Detecting out-of-context media, such as "mis-captioned" images on Twitter, is a relevant problem, especially in domains of high public significance. Twitter-COMMs is a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles. This dataset can be used to develop methods for detecting misinformation related to these three topics on social media platforms.
2 PAPERS • NO BENCHMARKS YET
MMVax-Stance includes 113 Vaccine Hesitancy Framings found on Twitter about the COVID-19 vaccines. Language experts annotated multimodal image-text tweets as Relevant or Not Relevant, and then further annotated Relevant tweets with Stance towards each framing.
1 PAPER • NO BENCHMARKS YET