Efficient Long-Range Attention Network for Image Super-resolution

13 Mar 2022 · Xindong Zhang, Hui Zeng, Shi Guo, Lei Zhang

Recently, transformer-based methods have demonstrated impressive results in various vision tasks, including image super-resolution (SR), by exploiting self-attention (SA) for feature extraction. However, the computation of SA in most existing transformer-based models is very expensive, and some of the employed operations may be redundant for the SR task. This limits the range over which SA can be computed and consequently the SR performance. In this work, we propose an efficient long-range attention network (ELAN) for image SR. Specifically, we first employ shift convolution (shift-conv) to effectively extract local structural information from the image while keeping the same level of complexity as a 1x1 convolution, then propose a group-wise multi-scale self-attention (GMSA) module, which calculates SA on non-overlapping groups of features using different window sizes to exploit long-range image dependencies. A highly efficient long-range attention block (ELAB) is then built by simply cascading two shift-conv layers with a GMSA module, which is further accelerated by a shared attention mechanism. Without bells and whistles, our ELAN follows a fairly simple design that sequentially cascades ELABs. Extensive experiments demonstrate that ELAN achieves better results than transformer-based SR models with significantly lower complexity. The source code can be found at https://github.com/xindongzhang/ELAN.
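To make the design concrete, below is a minimal PyTorch sketch of the three ingredients named in the abstract: shift-conv, GMSA, and an ELAB that cascades them. The module names, the channel split for shifting, the illustrative window sizes (4/8/16), and the residual wiring are assumptions for illustration rather than the authors' exact implementation; in particular, the shared-attention acceleration and the q/k/v projections are omitted for brevity (see the official repository for the real code).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ShiftConv(nn.Module):
    """Shift four channel groups by one pixel in the four cardinal
    directions (remaining channels stay put), then mix with a 1x1 conv.
    Captures local structure at roughly the cost of a 1x1 convolution.
    The 5-way channel split is an illustrative choice."""

    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.g = in_channels // 5  # channels per shift direction
        self.conv1x1 = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        b, c, h, w = x.shape
        g = self.g
        p = F.pad(x, (1, 1, 1, 1))  # zero-pad so shifted crops keep size h x w
        shifted = torch.cat([
            p[:, 0 * g:1 * g, 1:h + 1, 0:w],      # shift right
            p[:, 1 * g:2 * g, 1:h + 1, 2:w + 2],  # shift left
            p[:, 2 * g:3 * g, 0:h, 1:w + 1],      # shift down
            p[:, 3 * g:4 * g, 2:h + 2, 1:w + 1],  # shift up
            x[:, 4 * g:],                         # leftover channels, unshifted
        ], dim=1)
        return self.conv1x1(shifted)


def window_attention(x, win):
    """Plain self-attention inside non-overlapping win x win windows.
    Simplified: q = k = v = x, no projections or shared attention maps."""
    b, c, h, w = x.shape
    t = x.view(b, c, h // win, win, w // win, win)
    t = t.permute(0, 2, 4, 3, 5, 1).reshape(-1, win * win, c)  # window tokens
    attn = torch.softmax(t @ t.transpose(-2, -1) / c ** 0.5, dim=-1)
    t = (attn @ t).reshape(b, h // win, w // win, win, win, c)
    return t.permute(0, 5, 1, 3, 2, 4).reshape(b, c, h, w)


class GMSA(nn.Module):
    """Group-wise multi-scale self-attention: split channels into groups,
    run window attention with a different window size on each group, then
    merge with a 1x1 conv. Window sizes here are illustrative."""

    def __init__(self, channels, window_sizes=(4, 8, 16)):
        super().__init__()
        self.window_sizes = window_sizes
        self.split = channels // len(window_sizes)
        self.merge = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        groups = torch.split(x, self.split, dim=1)
        out = [window_attention(g, s) for g, s in zip(groups, self.window_sizes)]
        return self.merge(torch.cat(out, dim=1))


class ELAB(nn.Module):
    """Efficient long-range attention block: two cascaded shift-convs
    for local features, then GMSA, each with a residual connection."""

    def __init__(self, channels):
        super().__init__()
        self.conv1 = ShiftConv(channels, channels)
        self.conv2 = ShiftConv(channels, channels)
        self.gmsa = GMSA(channels)

    def forward(self, x):
        x = self.conv2(F.relu(self.conv1(x))) + x  # local feature extraction
        return self.gmsa(x) + x                    # long-range attention


x = torch.randn(1, 48, 64, 64)  # h and w must be divisible by the largest window
print(ELAB(48)(x).shape)        # torch.Size([1, 48, 64, 64])
```

A full ELAN would sequentially stack many such blocks between a shallow feature extractor and an upsampling tail; the shared attention mechanism reuses the attention maps of one GMSA across neighboring blocks to cut cost further.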

Task                    Dataset                  Model       Metric  Value   Global Rank
Image Super-Resolution  Manga109 - 4x upscaling  ELAN        PSNR    31.68   #16
Image Super-Resolution  Manga109 - 4x upscaling  ELAN        SSIM    0.9226  #11
Image Super-Resolution  Set14 - 4x upscaling     ELAN-light  PSNR    28.78   #34
Image Super-Resolution  Set14 - 4x upscaling     ELAN-light  SSIM    0.7858  #35
Image Super-Resolution  Set14 - 4x upscaling     ELAN        PSNR    28.96   #19
Image Super-Resolution  Set14 - 4x upscaling     ELAN        SSIM    0.7914  #19
