2 dataset results for segmentation AND Video Question Answering

…Each worker is assigned one video segment and asked to write one question with four answer candidates (one correct and three distractors).

22 PAPERS • 2 BENCHMARKS

Perception Test

…The videos are densely annotated with six types of labels: object and point tracks, temporal action and sound segments, multiple-choice video question-answers and grounded video question-answers.

4 PAPERS • NO BENCHMARKS YET
