iVQA (Instructional Video Question Answering)

Introduced by Yang et al. in Just Ask: Learning to Answer Questions from Millions of Narrated Videos

An open-ended VideoQA benchmark that aims to: i) provide a well-defined evaluation by including five correct answer annotations per question and ii) avoid questions which can be answered without the video.

iVQA contains 10,000 video clips, each with one question and five corresponding answers. The authors manually reduce language bias by excluding questions that could be answered without watching the video.

Source: Just Ask: Learning to Answer Questions from Millions of Narrated Videos
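
The one-question, five-answer layout described above can be illustrated with a small sketch. Everything here is an assumption for illustration: the field names, the sample record, and the scoring rule are not taken from this page, and the released annotation files may be organized differently. The scoring rule shown gives full credit to a prediction matched by at least two annotators and half credit to one matched by a single annotator, which is one plausible way to make use of multiple reference answers.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class IVQAExample:
    # Hypothetical field names; the actual annotation format may differ.
    video_id: str
    question: str
    answers: List[str]  # five annotator answers for the single question


def soft_accuracy(prediction: str, answers: List[str]) -> float:
    """Score a prediction against the reference answers.

    Assumed rule (not stated on this page): 1.0 if at least two
    annotators gave the predicted answer, 0.5 if exactly one did,
    0.0 otherwise.
    """
    counts = Counter(a.strip().lower() for a in answers)
    return min(counts[prediction.strip().lower()] / 2, 1.0)


# Made-up record, for illustration only.
example = IVQAExample(
    video_id="clip_0001",
    question="What is being poured into the bowl?",
    answers=["milk", "milk", "cream", "milk", "water"],
)

for guess in ("milk", "cream", "oil"):
    print(guess, soft_accuracy(guess, example.answers))
```

Under this assumed rule, an answer given by only one annotator still earns partial credit, so a model is not penalized in full for picking a less common but valid answer.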
