The ActivityNet-QA dataset contains 58,000 human-annotated QA pairs on 5,800 videos derived from the popular ActivityNet dataset. The dataset provides a benchmark for testing the performance of VideoQA models on long-term spatio-temporal reasoning.
Source: ActivityNet-QA: A Dataset for Understanding Complex Web Videos via Question Answering