Revisiting Temporal Modeling for Video Super-resolution

13 Aug 2020 · Takashi Isobe, Fang Zhu, Xu Jia, Shengjin Wang

Video super-resolution (VSR) plays an important role in surveillance video analysis and ultra-high-definition video display, and has drawn much attention in both the research and industrial communities. Although many deep learning-based VSR methods have been proposed, it is hard to compare them directly, since differences in loss functions and training datasets have a significant impact on the super-resolution results. In this work, we carefully study and compare three temporal modeling methods (2D CNN with early fusion, 3D CNN with slow fusion, and Recurrent Neural Network) for video super-resolution. We also propose a novel Recurrent Residual Network (RRN) for efficient video super-resolution, in which residual learning is used both to stabilize the training of the RNN and to boost super-resolution performance. Extensive experiments show that the proposed RRN is highly computationally efficient and produces temporally consistent VSR results with finer details than the other temporal modeling methods. In addition, the proposed method achieves state-of-the-art results on several widely used benchmarks.
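
To make the idea of residual learning inside a recurrent pass more concrete, here is a toy sketch in plain Python. It is not the authors' RRN: the real network operates on image tensors with learned convolutions, while this sketch uses flat lists of floats and a single hypothetical scalar `weight` standing in for the learned branch. The function names (`residual_block`, `recurrent_residual_pass`) and the additive fusion of frame and hidden state are assumptions made purely for illustration.

```python
def relu(values):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in values]


def residual_block(state, weight):
    """Identity skip connection: output = input + weight * relu(input).

    `weight` is a hypothetical stand-in for a learned transformation.
    """
    return [s + weight * r for s, r in zip(state, relu(state))]


def recurrent_residual_pass(frames, num_blocks=3, weight=0.1):
    """Process frames in order, carrying a hidden state between time steps."""
    hidden = [0.0] * len(frames[0])
    outputs = []
    for frame in frames:
        # Assumed fusion step: add the previous hidden state to the current frame.
        state = [f + h for f, h in zip(frame, hidden)]
        # Stack of residual blocks; each one keeps an identity path for the state.
        for _ in range(num_blocks):
            state = residual_block(state, weight)
        hidden = state
        outputs.append(state)
    return outputs
```

With `weight=0.0` every block reduces to the identity, so the hidden state passes through the stack unchanged. That unobstructed identity path is the usual intuition for why residual connections can help when information must be propagated over many time steps.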

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Video Super-Resolution | MSU Video Super Resolution Benchmark: Detail Restoration | RRN-5L | Subjective score | 5.02 | # 20 |
| | | | ERQAv1.0 | 0.617 | # 25 |
| | | | QRCRv1.0 | 0.549 | # 14 |
| | | | SSIM | 0.789 | # 27 |
| | | | PSNR | 23.786 | # 32 |
| | | | FPS | 2.74 | # 5 |
| | | | 1 - LPIPS | 0.856 | # 23 |
| Video Super-Resolution | MSU Video Super Resolution Benchmark: Detail Restoration | RRN-10L | Subjective score | 5.35 | # 15 |
| | | | ERQAv1.0 | 0.627 | # 23 |
| | | | QRCRv1.0 | 0.557 | # 10 |
| | | | SSIM | 0.79 | # 26 |
| | | | PSNR | 24.252 | # 30 |
| | | | FPS | 2.567 | # 6 |
| | | | 1 - LPIPS | 0.842 | # 25 |
| Video Super-Resolution | SPMCS - 4x upscaling | RRN-L | PSNR | 29.84 | # 1 |
| | | | SSIM | 0.8690 | # 1 |
| Video Super-Resolution | UDM10 - 4x upscaling | RRN-L | PSNR | 38.97 | # 7 |
| | | | SSIM | 0.9534 | # 7 |
| Video Super-Resolution | Vid4 - 4x upscaling - BD degradation | RRN | PSNR | 27.69 | # 11 |
| | | | SSIM | 0.8488 | # 11 |
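
The PSNR figures above are in decibels. As a reminder of how the metric is commonly defined, PSNR = 10 · log10(peak² / MSE), where MSE is the mean squared error between the reference and the restored frame. The sketch below assumes 8-bit pixel values (peak of 255) and frames given as flat lists of numbers; the function name `psnr` is hypothetical. Benchmark-specific choices such as the colour channel evaluated or border cropping are not stated here and are not modeled.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because the scale is logarithmic, halving the MSE raises PSNR by about 3 dB, so small gaps between entries in the table can still correspond to a noticeable difference in reconstruction error.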

Methods