Rethinking Fast Fourier Convolution in Image Inpainting

The recently proposed image inpainting method LaMa builds its network upon Fast Fourier Convolution (FFC), which was originally designed for high-level vision tasks such as image classification. FFC gives a fully convolutional network a global receptive field even in its early layers. Thanks to this unique property of the FFC module, LaMa can produce robust repeating textures, which previous inpainting methods could not achieve. But is the vanilla FFC module suitable for low-level vision tasks such as image inpainting? In this paper, we analyze the fundamental flaws of using FFC in image inpainting: 1) spectrum shifting, 2) unexpected spatial activation, and 3) a limited frequency receptive field. These flaws make it difficult for an FFC-based inpainting framework to generate complicated textures and perform faithful reconstruction. Based on this analysis, we propose a novel Unbiased Fast Fourier Convolution (UFFC) module, which modifies the vanilla FFC module with 1) a range transform and inverse transform, 2) an absolute position embedding, 3) a dynamic skip connection, and 4) adaptive clipping, to overcome these flaws and achieve better inpainting results. Extensive experiments on several benchmark datasets demonstrate the effectiveness of our method, which outperforms state-of-the-art methods in both texture-capturing ability and expressiveness.
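To make the global-receptive-field claim concrete, below is a minimal sketch of the spectral branch that underlies FFC, written in PyTorch. It is an illustration, not the authors' implementation: the class name `SpectralTransform`, the `freq_conv` block, and all hyperparameters are assumptions. The key idea it demonstrates is that a 2D real FFT couples every output position to every input position, a pointwise convolution then mixes features in the frequency domain, and an inverse FFT returns to the spatial domain.

```python
# Minimal sketch of an FFC-style spectral branch (illustrative, not the
# authors' code). Every output pixel depends on the whole input because
# the mixing happens in the Fourier domain.
import torch
import torch.nn as nn


class SpectralTransform(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Real and imaginary parts are stacked along the channel axis,
        # so the frequency-domain convolution sees 2 * channels inputs.
        self.freq_conv = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.BatchNorm2d(2 * channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        # rfft2 keeps only the non-redundant half of the spectrum.
        freq = torch.fft.rfft2(x, norm="ortho")          # (B, C, H, W//2+1), complex
        freq = torch.cat([freq.real, freq.imag], dim=1)  # (B, 2C, H, W//2+1), real
        freq = self.freq_conv(freq)                      # pointwise mixing of frequencies
        real, imag = torch.chunk(freq, 2, dim=1)
        freq = torch.complex(real, imag)
        # Back to the spatial domain; the result has a global receptive field.
        return torch.fft.irfft2(freq, s=(h, w), norm="ortho")


# Usage: a single pass over a dummy feature map.
if __name__ == "__main__":
    block = SpectralTransform(channels=64)
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Because the 1x1 convolution acts on the full spectrum, even this single layer mixes information globally, which is what lets LaMa capture repeating textures; the flaws analyzed in the paper (spectrum shifting, unexpected spatial activation, limited frequency receptive field) arise from exactly this frequency-domain processing.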
