The WinoWhy dataset is a resource that provides human-annotated reasons for answering Winograd Schema Challenge (WSC) questions. It includes the original WSC dataset and 4095 WinoWhy reasons (15 for each WSC question) that could justify the pronoun coreference choices in WSC.

The reasons in WinoWhy come from three sources:

  1. Human: Reasons provided by human annotators.
  2. Human Reverse: Human reasons written for the paired WSC question.
  3. Generation Model: Reasons generated by GPT-2 for the same question.

Each WSC question has 5 reasons from each source. These reasons are then used to categorize the types of commonsense knowledge needed to solve each WSC question. The dataset also defines a new task, likewise called WinoWhy, in which models must distinguish plausible reasons from very similar but wrong ones for all WSC questions. This helps investigate whether current WSC models actually understand the commonsense knowledge involved or simply solve WSC questions by exploiting statistical biases in the dataset.
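The counts above can be checked with a short sketch. The record layout here is an assumption made for illustration only: the `Reason` class, its field names, and the source labels are hypothetical and do not reflect the dataset's actual file format. The figure of 273 questions is not stated above; it is derived from 4095 reasons at 15 per question.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical labels for the three sources described above.
SOURCES = ("human", "human_reverse", "generation_model")
REASONS_PER_SOURCE = 5  # stated above: 5 reasons from each source


@dataclass(frozen=True)
class Reason:
    """Illustrative record; field names are assumptions, not the real schema."""
    question_id: int
    source: str
    text: str


def expected_total(num_questions: int) -> int:
    """Total reasons if every question has 5 reasons from each source."""
    return num_questions * len(SOURCES) * REASONS_PER_SOURCE


def is_balanced(reasons, question_id: int) -> bool:
    """True if the given question has exactly 5 reasons from each source."""
    counts = Counter(r.source for r in reasons if r.question_id == question_id)
    return all(counts[s] == REASONS_PER_SOURCE for s in SOURCES)


# 4095 reasons at 15 per question implies this many WSC questions.
num_questions = 4095 // (len(SOURCES) * REASONS_PER_SOURCE)

# Toy data for a single question, using placeholder text.
toy = [
    Reason(0, source, f"placeholder reason {i}")
    for source in SOURCES
    for i in range(REASONS_PER_SOURCE)
]
print(num_questions, expected_total(num_questions), is_balanced(toy, 0))
```

A check such as `is_balanced` is one way to validate a local copy of the data against the per-source counts described above before using it for the WinoWhy task.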

License


  • Unknown
