1 code implementation • ICCV 2023 • Yaojie Shen, Xin Gu, Kai Xu, Heng Fan, Longyin Wen, Libo Zhang
Addressing this, we study video captioning from a different perspective in the compressed domain, which brings multi-fold advantages over the existing pipeline: 1) Compared to raw images from the decoded video, the compressed video, consisting of I-frames, motion vectors and residuals, is highly distinguishable, which, through a specialized model design, allows us to leverage the entire video for learning without manual sampling; 2) The captioning model is more efficient in inference because it processes a smaller amount of less redundant information.
Ranked #8 on Video Captioning on VATEX
1 code implementation • 27 Jul 2022 • Yaojie Shen, Libo Zhang, Kai Xu, Xiaojie Jin
First, we learn the embedding of video transitions through a video transition classification task.