1 code implementation • 26 Oct 2023 • Xiao Liang, Tao Shi, Yaoyuan Liang, Te Tao, Shao-Lun Huang
In this paper, we propose DiffusionVG, a novel framework with diffusion models that formulates video grounding as a conditional generation task, where the target span is generated from Gaussian noise inputs and iteratively refined in the reverse diffusion process.
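
To make the idea of refining a span from noise concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a span is encoded as a normalized (center, width) pair, uses a cosine noise schedule, and applies a deterministic DDIM-style update; none of these choices are stated in the abstract. `toy_denoiser`, `cosine_alpha_bar`, and `sample_span` are hypothetical names, and the denoiser is a stand-in that simply returns the span carried in `condition`, where a real model would predict it from the video and the query.

```python
import numpy as np


def cosine_alpha_bar(num_steps):
    # Cumulative signal level alpha_bar_t for t = 0..num_steps
    # (exactly 1 at t = 0, close to 0 at t = num_steps). Assumed schedule.
    t = np.linspace(0.0, 1.0, num_steps + 1)
    f = np.cos((t + 0.008) / 1.008 * np.pi / 2) ** 2
    return f / f[0]


def toy_denoiser(noisy_span, t, condition):
    # Hypothetical stand-in for a learned network conditioned on video and
    # query features. It ignores the noisy input and returns the span stored
    # in `condition` as its estimate of the clean span.
    return np.asarray(condition, dtype=float)


def sample_span(denoiser, condition, num_steps=50, seed=0):
    rng = np.random.default_rng(seed)
    alpha_bar = cosine_alpha_bar(num_steps)
    # Start from pure Gaussian noise over (center, width).
    x = rng.standard_normal(2)
    for t in range(num_steps, 0, -1):
        # Predict the clean span, then keep it inside the normalized range.
        x0_pred = np.clip(denoiser(x, t, condition), 0.0, 1.0)
        a_t, a_prev = alpha_bar[t], alpha_bar[t - 1]
        # Noise implied by the current sample and the clean-span estimate.
        eps = (x - np.sqrt(a_t) * x0_pred) / np.sqrt(1.0 - a_t)
        # Deterministic step to the next, less noisy level.
        x = np.sqrt(a_prev) * x0_pred + np.sqrt(1.0 - a_prev) * eps
    return x


span = sample_span(toy_denoiser, condition=(0.4, 0.2))
```

Because the stand-in denoiser always returns the same estimate, the loop converges to that span regardless of the initial noise; with a trained network the estimate would change from step to step as the sample becomes less noisy.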