A Flexible Recurrent Residual Pyramid Network for Video Frame Interpolation

ECCV 2020 · Haoxian Zhang, Yang Zhao, Ronggang Wang ·

Video frame interpolation (VFI) aims at synthesizing new video frames in-between existing frames to generate smoother high frame rate videos. Current methods usually use the fixed pre-trained networks to generate interpolated-frames for different resolutions and scenes. However, the fixed pre-trained networks are difficult to be tailored for a variety of cases. Inspired by classical pyramid energy minimization optical flow algorithms, this paper proposes a recurrent residual pyramid network (RRPN) for video frame interpolation. In the proposed network, different pyramid levels share the same weights and base-network, named recurrent residual layer (RRL). In RRL, residual displacements between warped images are detected to gradually refine optical flows rather than directly predict the flows or frames. Owing to the flexible recurrent residual pyramid architecture, we can customize the number of pyramid levels, and make trade-offs between calculations and quality based on the application scenarios. Moreover, occlusion masks are also generated in this recurrent residual way to solve occlusion better. Finally, a refinement network is added to enhance the details for final output with contextual and edge information. Experimental results demonstrate that the RRPN is more flexible and efficient than current VFI networks but has fewer parameters. In particular, the RRPN, which avoid over-reliance on datasets and network structures, shows superior performance for large motion cases.

PDF Abstract