BABE is an expert-annotated dataset for media bias research. It comprises 3,700 sentences that are evenly distributed across topics and outlets, and each sentence is annotated for media bias at both the word and sentence levels. BABE was built by collecting and annotating sentences extracted from news articles that cover a range of predefined controversial topics and were published by different U.S. media outlets between January 2017 and June 2020.
A key aspect of BABE is its reliance on expert annotation to ensure data quality. Annotators were selected for their experience in the domain of media bias and received comprehensive training to ensure consistent, neutral annotation. This approach contrasts with crowd-sourced annotation and is designed to yield more reliable, higher-quality bias labels.