A Qualitative Comparison of CoQA, SQuAD 2.0 and QuAC

NAACL 2019 • Mark Yatskar

We compare three new datasets for question answering: SQuAD 2.0, QuAC, and CoQA, along several of their new features: (1) unanswerable questions, (2) multi-turn interactions, and (3) abstractive answers. We show that the datasets provide complementary coverage of the first two aspects, but weak coverage of the third...


Results from the Paper


TASK                DATASET  MODEL                   METRIC NAME    METRIC VALUE  GLOBAL RANK
Question Answering  CoQA     BiDAF++ (single model)  In-domain      69.4          #6
Question Answering  CoQA     BiDAF++ (single model)  Out-of-domain  63.8          #6
Question Answering  CoQA     BiDAF++ (single model)  Overall        67.8          #7
