Coherent Semantic Attention for Image Inpainting

ICCV 2019  ·  Hongyu Liu, Bin Jiang, Yi Xiao, Chao Yang ·

The latest deep learning-based approaches have shown promising results for the challenging task of inpainting missing regions of an image. However, the existing methods often generate contents with blurry textures and distorted structures due to the discontinuity of the local pixels. From a semantic-level perspective, the local pixel discontinuity is mainly because these methods ignore the semantic relevance and feature continuity of hole regions. To handle this problem, we investigate the human behavior in repairing pictures and propose a fined deep generative model-based approach with a novel coherent semantic attention (CSA) layer, which can not only preserve contextual structure but also make more effective predictions of missing parts by modeling the semantic relevance between the holes features. The task is divided into rough, refinement as two steps and model each step with a neural network under the U-Net architecture, where the CSA layer is embedded into the encoder of refinement step. To stabilize the network training process and promote the CSA layer to learn more effective parameters, we propose a consistency loss to enforce the both the CSA layer and the corresponding layer of the CSA in decoder to be close to the VGG feature layer of a ground truth image simultaneously. The experiments on CelebA, Places2, and Paris StreetView datasets have validated the effectiveness of our proposed methods in image inpainting tasks and can obtain images with a higher quality as compared with the existing state-of-the-art approaches.

PDF Abstract ICCV 2019 PDF ICCV 2019 Abstract


Results from the Paper

Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Image Inpainting Paris StreetView Coherent Semantic Attention for Image Inpainting 40-50% Mask PSNR 23.10 # 1
10-20% Mask PSNR 32.67 # 1
20-30% Mask PSNR 30.32 # 1
30-40% Mask PSNR 24.85 # 1