CICERO contains 53,000 inferences for five commonsense dimensions -- cause, subsequent event, prerequisite, motivation, and emotional reaction -- collected from 5600 dialogues. It involves two challenging generative and multi-choice alternative selection tasks for the state-of-the-art NLP models to solve. Download the dataset using this link.
12 PAPERS • 4 BENCHMARKS
QAMPARI is an ODQA benchmark, where question answers are lists of entities, spread across many paragraphs. It was created by (a) generating questions with multiple answers from Wikipedia's knowledge graph and tables, (b) automatically pairing answers with supporting evidence in Wikipedia paragraphs, and (c) manually paraphrasing questions and validating each answer.
11 PAPERS • NO BENCHMARKS YET
Dataset Description The dataset described in the provided text is focused on social media polls collected from Weibo, a popular Chinese microblogging platform. The dataset aims to empirically study social media polls and analyze user engagement patterns.
3 PAPERS • 3 BENCHMARKS