A quantitative benchmark for developing and understanding video models, consisting of a fill-in-the-blank question-answering dataset with over 300,000 examples, based on descriptive video annotations for the visually impaired.
Source: A dataset and exploration of models for understanding video data through fill-in-the-blank question-answering