Deep Video Inpainting Detection

26 Jan 2021  ·  Peng Zhou, Ning Yu, Zuxuan Wu, Larry S. Davis, Abhinav Shrivastava, Ser-Nam Lim ·

This paper studies video inpainting detection, which localizes an inpainted region in a video both spatially and temporally. In particular, we introduce VIDNet, Video Inpainting Detection Network, which contains a two-stream encoder-decoder architecture with attention module. To reveal artifacts encoded in compression, VIDNet additionally takes in Error Level Analysis frames to augment RGB frames, producing multimodal features at different levels with an encoder. Exploring spatial and temporal relationships, these features are further decoded by a Convolutional LSTM to predict masks of inpainted regions. In addition, when detecting whether a pixel is inpainted or not, we present a quad-directional local attention module that borrows information from its surrounding pixels from four directions. Extensive experiments are conducted to validate our approach. We demonstrate, among other things, that VIDNet not only outperforms by clear margins alternative inpainting detection methods but also generalizes well on novel videos that are unseen during training.

PDF Abstract
No code implementations yet. Submit your code now

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.