FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation

Video interpolation aims to synthesize non-existent frames between two consecutive frames. Although existing optical-flow-based methods have achieved promising results, they still face significant challenges with complicated dynamic scenes involving occlusion, blur, or abrupt brightness changes. This is mainly because such cases can break the basic assumptions of optical flow estimation (i.e., smoothness and consistency). In this work, we devise a novel structure-to-texture generation framework that splits the video interpolation task into two stages: structure-guided interpolation and texture refinement. In the first stage, deep structure-aware features are employed to predict feature flows from the two consecutive frames to their intermediate frame, which are then used to generate a structure image of the intermediate frame. In the second stage, a Frame Texture Compensator is trained to fill detailed textures into this coarse result. To the best of our knowledge, this is the first work that attempts to generate the intermediate frame directly by blending deep features. Experiments on both benchmark datasets and challenging occlusion cases demonstrate the superiority of the proposed framework over state-of-the-art methods. Code is available at https://github.com/CM-BF/FeatureFlow.
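To make the two-stage dataflow concrete, below is a minimal PyTorch sketch of the idea described in the abstract: stage 1 encodes both input frames into deep features, predicts a flow from each frame to the intermediate time plus a blend mask, warps and blends the features, and decodes a coarse structure image; stage 2 adds a residual pass that fills in texture. All names here (StructureStage, TextureCompensator, warp) and the tiny convolutional heads are illustrative assumptions, not the paper's actual architecture; the released code at the repository above is the authoritative implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp(feat, flow):
    """Backward-warp a feature map (B, C, H, W) with a dense flow field (B, 2, H, W)."""
    b, _, h, w = feat.shape
    # Build a normalized sampling grid in [-1, 1] for grid_sample.
    ys, xs = torch.meshgrid(
        torch.arange(h, device=feat.device, dtype=feat.dtype),
        torch.arange(w, device=feat.device, dtype=feat.dtype),
        indexing="ij",
    )
    grid_x = (xs + flow[:, 0]) / max(w - 1, 1) * 2 - 1
    grid_y = (ys + flow[:, 1]) / max(h - 1, 1) * 2 - 1
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (B, H, W, 2), x first
    return F.grid_sample(feat, grid, align_corners=True)

class StructureStage(nn.Module):
    """Stage 1 (sketch): blend warped deep features into a coarse structure frame."""
    def __init__(self, ch=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        # Predicts two 2-channel flows (one per input frame) plus a 1-channel blend mask.
        self.flow_head = nn.Conv2d(2 * ch, 5, 3, padding=1)
        self.decoder = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, frame0, frame1):
        f0, f1 = self.encoder(frame0), self.encoder(frame1)
        out = self.flow_head(torch.cat((f0, f1), dim=1))
        flow0, flow1 = out[:, :2], out[:, 2:4]
        mask = torch.sigmoid(out[:, 4:5])
        # Blend the two warped feature maps, then decode a coarse structure image.
        blended = mask * warp(f0, flow0) + (1 - mask) * warp(f1, flow1)
        return self.decoder(blended)

class TextureCompensator(nn.Module):
    """Stage 2 (sketch): residual refinement that fills texture into the coarse result."""
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(9, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, coarse, frame0, frame1):
        return coarse + self.net(torch.cat((coarse, frame0, frame1), dim=1))

# Minimal usage sketch:
s, t = StructureStage(), TextureCompensator()
a, b = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
mid = t(s(a, b), a, b)  # interpolated intermediate frame, shape (1, 3, 64, 64)
```

The key design point the abstract emphasizes is that warping and blending happen on deep features rather than on pixels, so the intermediate frame is generated directly from blended features instead of from a pixel-level optical-flow warp.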


Results from the Paper


Task                       Dataset                        Model     Metric   Value  Global Rank
Video Frame Interpolation  MSU Video Frame Interpolation  FeFlow_f  PSNR     24.48  #21
                                                                    SSIM     0.902  #19
                                                                    VMAF     60.70  #18
                                                                    LPIPS    0.060  #17
                                                                    MS-SSIM  0.911  #19
Video Frame Interpolation  MSU Video Frame Interpolation  FeFlow    PSNR     23.28  #23
                                                                    SSIM     0.889  #22
                                                                    VMAF     58.11  #23
                                                                    LPIPS    0.070  #20
                                                                    MS-SSIM  0.894  #22
Video Frame Interpolation  X4K1000FPS                     FeFlow_f  PSNR     25.16  #15
                                                                    SSIM     0.783  #13
                                                                    tOF      6.54   #6
Video Frame Interpolation  X4K1000FPS                     FeFlow    PSNR     24.00  #16
                                                                    SSIM     0.756  #15
                                                                    tOF      6.59   #7
