Image Shape Manipulation from a Single Augmented Training Sample

In this paper, we present DeepSIM, a generative model for conditional image manipulation based on a single image. We find that extensive augmentation is key to enabling single-image training, and we incorporate thin-plate-spline (TPS) warping as an effective augmentation. Our network learns to map from a primitive representation of the image to the image itself. The choice of primitive representation affects the ease and expressiveness of the manipulations, and it can be automatic (e.g. edges), manual (e.g. segmentation), or hybrid (e.g. edges on top of segmentations). At manipulation time, our generator allows for complex image changes: the user modifies the primitive input representation and maps it through the network. We show that our method achieves strong performance on image manipulation tasks.
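The TPS augmentation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`fit_tps`, `apply_tps`, `random_tps_pair`), the 3x3 control grid, the displacement scale, and the nearest-neighbour sampling are all assumptions chosen for brevity. The point it shows is that the same random warp is applied to the primitive and to the image, so the pair stays aligned.

```python
import numpy as np

def tps_kernel(r2):
    # TPS radial basis U = r^2 * log(r^2), defined as 0 at r = 0.
    out = np.zeros_like(r2)
    mask = r2 > 0
    out[mask] = r2[mask] * np.log(r2[mask])
    return out

def fit_tps(src, dst):
    # Solve for TPS parameters that map control points src (n, 2) to dst (n, 2).
    n = src.shape[0]
    d2 = ((src[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((n, 1)), src])
    A = np.zeros((n + 3, n + 3))
    A[:n, :n] = tps_kernel(d2)
    A[:n, n:] = P
    A[n:, :n] = P.T
    b = np.zeros((n + 3, 2))
    b[:n] = dst
    return np.linalg.solve(A, b)  # rows: n kernel weights, then 3 affine terms

def apply_tps(params, src, pts):
    # Evaluate the fitted spline at query points pts (m, 2).
    n = src.shape[0]
    d2 = ((pts[:, None, :] - src[None, :, :]) ** 2).sum(-1)
    P = np.hstack([np.ones((pts.shape[0], 1)), pts])
    return tps_kernel(d2) @ params[:n] + P @ params[n:]

def random_tps_pair(primitive, image, grid=3, scale=0.1, rng=None):
    # Warp primitive and image with one shared random TPS (hypothetical helper).
    # Uses backward mapping with nearest-neighbour sampling; suited to small images.
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    g = np.linspace(0.0, 1.0, grid)
    ctrl = np.stack(np.meshgrid(g, g, indexing="ij"), -1).reshape(-1, 2)
    shifted = ctrl + rng.uniform(-scale, scale, ctrl.shape)
    params = fit_tps(ctrl, shifted)

    ys, xs = np.meshgrid(np.linspace(0.0, 1.0, h), np.linspace(0.0, 1.0, w), indexing="ij")
    pts = np.stack([ys, xs], -1).reshape(-1, 2)
    mapped = apply_tps(params, ctrl, pts)
    iy = np.clip(np.rint(mapped[:, 0] * (h - 1)), 0, h - 1).astype(int)
    ix = np.clip(np.rint(mapped[:, 1] * (w - 1)), 0, w - 1).astype(int)

    def warp(arr):
        # Works for (h, w) and (h, w, c) arrays that share the image's height and width.
        return arr[iy, ix].reshape(arr.shape)

    return warp(primitive), warp(image)
```

In a training loop, each iteration would draw a fresh warped (primitive, image) pair from the single training sample; a real pipeline would more likely use bilinear sampling for the image and a label-preserving scheme for segmentation primitives.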

Published at ICCV 2021.

Results from the Paper


Task                Dataset  Model          Metric Name  Metric Value  Global Rank
Image Manipulation  LRS2     TPS            LPIPS (S1)   0.12          #1
                                            SIFID (S1)   0.07          #1
                                            LPIPS (S2)   0.21          #1
                                            SIFID (S2)   0.12          #1
                                            LPIPS (S3)   0.1           #1
                                            SIFID (S3)   0.04          #1
                                            LPIPS (S4)   0.22          #1
                                            SIFID (S4)   0.12          #1
                                            LPIPS (S5)   0.14          #1
                                            SIFID (S5)   0.06          #1
Image Manipulation  LRS2     Pix2PixHD-SIA  LPIPS (S1)   0.44          #2
                                            SIFID (S1)   0.51          #2
                                            LPIPS (S2)   0.47          #2
                                            SIFID (S2)   0.49          #2
                                            LPIPS (S3)   0.41          #2
                                            SIFID (S3)   0.5           #2
                                            LPIPS (S4)   0.53          #2
                                            SIFID (S4)   0.26          #2
                                            LPIPS (S5)   0.46          #2
                                            SIFID (S5)   0.44          #2

Methods