SwinIR: Image Restoration Using Swin Transformer

23 Aug 2021  ·  Jingyun Liang, JieZhang Cao, Guolei Sun, Kai Zhang, Luc van Gool, Radu Timofte

Image restoration is a long-standing low-level vision problem that aims to restore high-quality images from low-quality images (e.g., downscaled, noisy and compressed images). While state-of-the-art image restoration methods are based on convolutional neural networks, few attempts have been made with Transformers, which show impressive performance on high-level vision tasks. In this paper, we propose a strong baseline model SwinIR for image restoration based on the Swin Transformer. SwinIR consists of three parts: shallow feature extraction, deep feature extraction and high-quality image reconstruction. In particular, the deep feature extraction module is composed of several residual Swin Transformer blocks (RSTB), each of which has several Swin Transformer layers together with a residual connection. We conduct experiments on three representative tasks: image super-resolution (including classical, lightweight and real-world image super-resolution), image denoising (including grayscale and color image denoising) and JPEG compression artifact reduction. Experimental results demonstrate that SwinIR outperforms state-of-the-art methods on different tasks by up to 0.14∼0.45 dB, while the total number of parameters can be reduced by up to 67%.
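
The three-part structure described in the abstract can be sketched as plain function composition. This is only an illustration of the data flow, not the authors' implementation: the names `make_rstb`, `swinir_forward`, `halve` and `identity` are hypothetical, features are flat lists of floats rather than image tensors, and the stand-in layers replace real Swin Transformer layers, which are not reproduced here. Any further details of the published architecture beyond what the abstract states are deliberately left out.

```python
from typing import Callable, List

Feature = List[float]
Layer = Callable[[Feature], Feature]


def make_rstb(layers: List[Layer]) -> Layer:
    """Build one residual block: apply the layers in order, then add the block input back."""
    def rstb(x: Feature) -> Feature:
        out = x
        for layer in layers:
            out = layer(out)
        # Residual connection around the whole stack of layers.
        return [o + i for o, i in zip(out, x)]
    return rstb


def swinir_forward(lq: Feature, shallow: Layer, rstbs: List[Layer], reconstruct: Layer) -> Feature:
    """Shallow feature extraction -> deep feature extraction (RSTBs) -> reconstruction."""
    features = shallow(lq)
    for block in rstbs:
        features = block(features)
    return reconstruct(features)


# Hypothetical stand-ins so the sketch runs; they are NOT Swin Transformer layers.
def halve(x: Feature) -> Feature:
    return [0.5 * v for v in x]


def identity(x: Feature) -> Feature:
    return list(x)


blocks = [make_rstb([halve, halve]) for _ in range(2)]
restored = swinir_forward([1.0, 2.0], identity, blocks, identity)
print(restored)  # [1.5625, 3.125]
```

With these stand-ins, each block maps x to 0.25x + x, so two blocks scale the input by 1.25 twice. The point is the residual term: even if the inner layers contribute little, the block input still passes through unchanged.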

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Grayscale Image Denoising | BSD68 sigma15 | SwinIR | PSNR | 31.97 | #2 |
| Color Image Denoising | Kodak24 sigma50 | SwinIR | PSNR | 29.79 | #2 |
| Image Super-Resolution | Manga109 - 4x upscaling | SwinIR+ (Training: DIV2K+Flickr2K) | PSNR | 32.22 | #4 |
| | | | SSIM | 0.9273 | #4 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | SwinIR + x265 | BSQ-rate over ERQA | 1.575 | #7 |
| | | | BSQ-rate over Subjective Score | 0.346 | #5 |
| | | | BSQ-rate over VMAF | 1.304 | #24 |
| | | | BSQ-rate over PSNR | 8.13 | #45 |
| | | | BSQ-rate over MS-SSIM | 4.641 | #57 |
| | | | BSQ-rate over LPIPS | 1.474 | #23 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | SwinIR + x264 | BSQ-rate over ERQA | 0.76 | #1 |
| | | | BSQ-rate over Subjective Score | 0.304 | #3 |
| | | | BSQ-rate over VMAF | 0.642 | #1 |
| | | | BSQ-rate over PSNR | 6.268 | #32 |
| | | | BSQ-rate over MS-SSIM | 0.736 | #9 |
| | | | BSQ-rate over LPIPS | 0.559 | #1 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | SwinIR + vvenc | BSQ-rate over ERQA | 6.624 | #31 |
| | | | BSQ-rate over Subjective Score | 1.35 | #20 |
| | | | BSQ-rate over VMAF | 0.887 | #15 |
| | | | BSQ-rate over PSNR | 8.971 | #48 |
| | | | BSQ-rate over MS-SSIM | 5.758 | #66 |
| | | | BSQ-rate over LPIPS | 1.552 | #24 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | SwinIR + uavs3e | BSQ-rate over ERQA | 6.803 | #34 |
| | | | BSQ-rate over Subjective Score | 0.639 | #8 |
| | | | BSQ-rate over VMAF | 1.848 | #42 |
| | | | BSQ-rate over PSNR | 15.144 | #73 |
| | | | BSQ-rate over MS-SSIM | 4.411 | #53 |
| | | | BSQ-rate over LPIPS | 1.671 | #25 |
| Video Super-Resolution | MSU Super-Resolution for Video Compression | SwinIR + aomenc | BSQ-rate over ERQA | 10.854 | #45 |
| | | | BSQ-rate over Subjective Score | 0.835 | #14 |
| | | | BSQ-rate over VMAF | 3.32 | #55 |
| | | | BSQ-rate over PSNR | 15.144 | #73 |
| | | | BSQ-rate over MS-SSIM | 7.105 | #75 |
| | | | BSQ-rate over LPIPS | 4.566 | #39 |
| Video Super-Resolution | MSU Video Super Resolution Benchmark: Detail Restoration | SwinIR | Subjective score | 4.799 | #23 |
| | | | ERQAv1.0 | 0.618 | #24 |
| | | | QRCRv1.0 | 0 | #21 |
| | | | SSIM | 0.782 | #28 |
| | | | PSNR | 25.12 | #26 |
| | | | FPS | 0.407 | #24 |
| | | | 1 - LPIPS | 0.895 | #11 |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | SwinIR-Real-S | PSNR | 28.55 | #31 |
| | | | LPIPS | 0.189 | #7 |
| | | | SSIM | 0.845 | #12 |
| Video Super-Resolution | MSU Video Upscalers: Quality Enhancement | SwinIR-Real-B | PSNR | 28.86 | #28 |
| | | | LPIPS | 0.183 | #4 |
| | | | SSIM | 0.830 | #4 |
| Image Super-Resolution | Set14 - 4x upscaling | SwinIR | PSNR | 29.15 | #4 |
| | | | SSIM | 0.7958 | #8 |
| Image Super-Resolution | Set5 - 4x upscaling | SwinIR | PSNR | 32.93 | #4 |
| | | | SSIM | 0.9043 | #6 |
| Image Super-Resolution | Urban100 - 4x upscaling | SwinIR | PSNR | 27.45 | #4 |
| | | | SSIM | 0.8254 | #5 |
| Color Image Denoising | Urban100 sigma10 | SwinIR | PSNR | 35.13 | #2 |
| Grayscale Image Denoising | Urban100 sigma15 | SwinIR | PSNR | 33.70 | #2 |
| Color Image Denoising | Urban100 sigma25 | SwinIR | PSNR | 32.9 | #2 |
| Grayscale Image Denoising | Urban100 sigma25 | SwinIR | PSNR | 31.3 | #3 |
| Color Image Denoising | Urban100 sigma50 | SwinIR | PSNR | 29.82 | #3 |
| Grayscale Image Denoising | Urban100 sigma50 | SwinIR | PSNR | 27.98 | #3 |
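
Many of the results above are PSNR values in dB. As a reminder of what that number measures, here is a minimal sketch of the standard PSNR definition, 10·log10(MAX² / MSE), on flat pixel lists. The function name `psnr` and the 8-bit default `max_value=255` are assumptions for illustration. Each benchmark follows its own evaluation protocol (color space, cropping, and so on), which this sketch does not attempt to reproduce.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[float], restored: Sequence[float], max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by exactly 1, so the MSE is 1.
value = psnr([255.0, 0.0], [254.0, 1.0])
print(round(value, 2))  # 48.13
```

Because the scale is logarithmic, a fixed gain in dB corresponds to a fixed multiplicative reduction in mean squared error, regardless of the starting PSNR.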

Methods