High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

We present a new method for synthesizing high-resolution photo-realistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). Conditional GANs have enabled a variety of applications, but the results are often limited to low resolution and remain far from realistic. In this work, we generate visually appealing 2048 × 1024 results with a novel adversarial loss, as well as new multi-scale generator and discriminator architectures. Furthermore, we extend our framework to interactive visual manipulation with two additional features. First, we incorporate object instance segmentation information, which enables object manipulations such as removing or adding objects and changing the object category. Second, we propose a method to generate diverse results given the same input, allowing users to edit the object appearance interactively. Human opinion studies demonstrate that our method significantly outperforms existing methods, advancing both the quality and the resolution of deep image synthesis and editing.
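To make the role of instance information concrete, here is a minimal sketch of one way an instance segmentation map could be turned into an extra input channel. The abstract does not say how the instance information is encoded, so the boundary-map encoding below is an assumption for illustration, and the function name `instance_boundary_map` and the toy data are hypothetical.

```python
def instance_boundary_map(instance_ids):
    """Return a 0/1 map marking pixels whose instance ID differs from
    at least one of their 4-connected neighbours (assumed encoding)."""
    height = len(instance_ids)
    width = len(instance_ids[0]) if height else 0
    boundary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            current = instance_ids[y][x]
            # Check the up, down, left and right neighbours inside the image.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and instance_ids[ny][nx] != current:
                    boundary[y][x] = 1
                    break
    return boundary


# Hypothetical toy input: two adjacent objects of the same class (IDs 1 and 2).
# A semantic label map alone would show one uniform region here.
toy_instances = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
toy_boundary = instance_boundary_map(toy_instances)
```

In this toy case the two objects would be indistinguishable in a semantic label map, while the boundary map marks the columns where they touch.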

CVPR 2018
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Image-to-Image Translation | ADE20K-Outdoor Labels-to-Photos | pix2pixHD | mIoU | 17.4 | #4 |
| | | | Accuracy | 71.6% | #3 |
| | | | FID | 97.8 | #6 |
| Image-to-Image Translation | Cityscapes Labels-to-Photo | pix2pixHD | Per-pixel Accuracy | 81.4% | #5 |
| | | | mIoU | 58.3 | #8 |
| | | | FID | 95 | #13 |
| Sketch-to-Image Translation | COCO-Stuff | Pix2PixHD | FID | 38.7 | #2 |
| | | | FID-C | 27.1 | #2 |
| Fundus to Angiography Generation | Fundus Fluorescein Angiogram Photographs & Colour Fundus Images of Diabetic Patients | pix2pixHD | FID | 42.8 | #7 |
| | | | Kernel Inception Distance | 0.00258 | #6 |
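
For labels-to-photos tasks, mIoU and per-pixel accuracy are typically obtained by running a pretrained segmentation network on the generated image and comparing its predicted labels with the input label map. The sketch below shows only the final scoring step, assuming the labels are already available as flat lists of class indices. The function names and the toy data are hypothetical and are not taken from any benchmark's evaluation code.

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count pixels per (true class, predicted class) pair."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_class, predicted_class in zip(true_labels, predicted_labels):
        matrix[true_class][predicted_class] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def mean_iou(matrix):
    """Average of per-class intersection-over-union, skipping classes
    that appear in neither the ground truth nor the prediction."""
    num_classes = len(matrix)
    ious = []
    for c in range(num_classes):
        true_positive = matrix[c][c]
        false_negative = sum(matrix[c]) - true_positive
        false_positive = sum(matrix[r][c] for r in range(num_classes)) - true_positive
        union = true_positive + false_positive + false_negative
        if union:
            ious.append(true_positive / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy example: six pixels, three classes.
toy_true = [0, 0, 1, 1, 2, 2]
toy_predicted = [0, 1, 1, 1, 2, 0]
toy_matrix = confusion_matrix(toy_true, toy_predicted, num_classes=3)
toy_accuracy = pixel_accuracy(toy_matrix)
toy_miou = mean_iou(toy_matrix)
```

Both functions return fractions in [0, 1], whereas the table lists these metrics on a 0 to 100 scale.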

Results from Other Papers

| Task | Dataset | Model | Metric | Value | Rank |
|---|---|---|---|---|---|
| Image-to-Image Translation | ADE20K Labels-to-Photos | pix2pixHD | mIoU | 20.3 | #8 |
| | | | Accuracy | 69.2% | #6 |
| | | | FID | 81.8 | #13 |
| Image-to-Image Translation | COCO-Stuff Labels-to-Photos | pix2pixHD | mIoU | 14.6 | #7 |
| | | | Accuracy | 45.8% | #5 |
| | | | FID | 111.5 | #12 |