We take advantage of the ground truth of NLVR images, design CFGs to generate stories, and use spatial reasoning rules to ask and answer spatial reasoning questions. This automatically generated data is called SpaRTQA. https://aclanthology.org/2021.naacl-main.364/
10 PAPERS • NO BENCHMARKS YET
MuSiQue-Ans is a new multihop QA dataset with ~25K 2-4 hop questions using seed questions from 5 existing single-hop datasets.
5 PAPERS • 1 BENCHMARK
ConcurrentQA is a textual multi-hop QA benchmark to require concurrent retrieval over multiple data-distributions (i.e. Wikipedia and email data). The dataset follow the exact same schema and design as HotpotQA. The data set is downloadable here: https://github.com/facebookresearch/concurrentqa. It also contains model and result analysis code. This benchmark can also be used to study privacy when reasoning over data distributed in multiple privacy scopes --- i.e. Wikipedia in the public domain and emails in the private domain.
2 PAPERS • 1 BENCHMARK
MultiQ is a multi-hop QA dataset for Russian, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks.