1 code implementation • Applied Sciences 2024 • Yifang Xu, Yunzhuo Sun, Zien Xie, Benxiang Zhai, Sidan Du
Video temporal grounding (VTG) aims to locate specific temporal segments from an untrimmed video based on a linguistic query.
Ranked #1 on Zero-shot Moment Retrieval on QVHighlights
no code implementations • 3 Mar 2024 • Yifang Xu, Chenglei Peng, Ming Li, Yang Li, Sidan Du
Deep convolutional neural networks (DCNNs) have achieved great success in monocular depth estimation (MDE).
no code implementations • 3 Mar 2024 • Yunzhuo Sun, Yifang Xu, Zien Xie, Yukun Shu, Sidan Du
First, MiniGPT-4 is employed to generate a detailed description of the video frame and to rewrite the query statement, both of which are fed into the encoder as new features.
no code implementations • 29 Apr 2023 • Yifang Xu, Yunzhuo Sun, Yang Li, Yilei Shi, Xiaoxiang Zhu, Sidan Du
With the increasing demand for video understanding, video moment and highlight detection (MHD) has emerged as a critical research topic.
no code implementations • 25 Jun 2021 • Zhicheng Cai, Chenglei Peng, Sidan Du
With the Jitter point acting as a random factor, we effectively add randomness to the loss function. This is consistent with the fact that innumerable random behaviors exist in the learning process of a machine learning model, and it is expected to make the model more robust.