…The dataset covers video retrieval, moment retrieval, and two novel tasks: moment segmentation and step captioning.