The question-answer (QA) pairs are automatically generated using state-of-the-art question generation methods based on paintings and comments provided in an existing art understanding dataset.
6 PAPERS • NO BENCHMARKS YET
…(bachelor_of_arts, juris_doctor).
202 PAPERS • 3 BENCHMARKS
…textual question answering benchmark for spatial reasoning on natural language text, which contains more realistic spatial phenomena not covered by prior datasets and is challenging for state-of-the-art…
8 PAPERS • NO BENCHMARKS YET
…The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least the ones used as adversaries in the annotation loop) find challenging.
24 PAPERS • 2 BENCHMARKS
…While all questions directly relate to the passage, the English dataset on its own proves difficult enough to challenge state-of-the-art language models.
19 PAPERS • NO BENCHMARKS YET
…Experimental evaluation shows that a host of baselines and state-of-the-art models based on shallow language understanding struggle to achieve a high score on the Story Cloze Test.
44 PAPERS • 1 BENCHMARK