no code implementations • 27 Mar 2024 • Ali Zare, Yulei Niu, Hammad Ayyubi, Shih-Fu Chang
(3) Annotation cost: Annotating instructional videos with step-level labels (i. e., timestamp) or sequence-level labels (i. e., action category) is demanding and labor-intensive, limiting its generalizability to large-scale datasets. In this work, we propose a new and practical setting, called adaptive procedure planning in instructional videos, where the procedure length is not fixed or pre-determined.
1 code implementation • 22 Oct 2022 • Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang
Specifically, given an article and a relevant video, WSAG aims to localize all ``groundable'' sentences to the video, and these sentences are possibly at different semantic scales.