Instance-Conditioned GAN

Generative Adversarial Networks (GANs) can generate near-photorealistic images in narrow domains such as human faces. Yet, modeling complex distributions of datasets such as ImageNet and COCO-Stuff remains challenging in unconditional settings. In this paper, we take inspiration from kernel density estimation techniques and introduce a non-parametric approach to modeling distributions of complex datasets. We partition the data manifold into a mixture of overlapping neighborhoods, each described by a datapoint and its nearest neighbors, and introduce a model, called instance-conditioned GAN (IC-GAN), which learns the distribution around each datapoint. Experimental results on ImageNet and COCO-Stuff show that IC-GAN significantly improves over unconditional models and unsupervised data partitioning baselines. Moreover, we show that IC-GAN can effortlessly transfer to datasets not seen during training by simply changing the conditioning instances, and still generate realistic images. Finally, we extend IC-GAN to the class-conditional case and show semantically controllable generation and competitive quantitative results on ImageNet, while improving over BigGAN on ImageNet-LT. Code and trained models to reproduce the reported results are available at https://github.com/facebookresearch/ic_gan.
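The neighborhood partitioning described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes each image has already been mapped to a feature vector by some embedding network, that neighbors are ranked by cosine similarity, and that an instance counts as a member of its own neighborhood. The function names `build_neighborhoods` and `sample_training_pairs` are hypothetical and chosen only for illustration.

```python
import numpy as np


def build_neighborhoods(features, k):
    """Return an (N, k) index array; row i holds the k nearest neighbors of instance i.

    Assumption: similarity is cosine similarity on precomputed features,
    so each instance is its own closest neighbor (column 0).
    """
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = normed @ normed.T  # (N, N) cosine similarities
    # Sort each row by decreasing similarity and keep the top k indices.
    return np.argsort(-sims, axis=1)[:, :k]


def sample_training_pairs(neighborhoods, batch_size, rng):
    """Sample (conditioning instance, real neighbor) index pairs.

    The conditioning index selects the instance a generator and discriminator
    would be conditioned on; the neighbor index selects a datapoint from that
    instance's neighborhood to serve as the "real" sample.
    """
    num_instances, k = neighborhoods.shape
    cond = rng.integers(0, num_instances, size=batch_size)
    slot = rng.integers(0, k, size=batch_size)
    return cond, neighborhoods[cond, slot]


# Toy usage with random vectors standing in for image features.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 16))
neighborhoods = build_neighborhoods(features, k=5)
cond_idx, real_idx = sample_training_pairs(neighborhoods, batch_size=8, rng=rng)
```

Because neighborhoods of different instances overlap, the same datapoint can act as a real sample for several conditioning instances, which is what makes the partition a mixture of overlapping neighborhoods rather than a hard clustering. Transferring to an unseen dataset, as the abstract describes, would amount to passing features of new instances as the conditioning input at generation time.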

Published at NeurIPS 2021.
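| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Conditional Image Generation | ImageNet 128x128 | IC-GAN + DA | FID | 9.5 | #12 |
| Conditional Image Generation | ImageNet 128x128 | IC-GAN + DA | Inception Score | 108.6 | #7 |
| Unconditional Image Generation | ImageNet 128x128 | IC-GAN + DA | FID | 11.7 | #1 |
| Unconditional Image Generation | ImageNet 128x128 | IC-GAN + DA | Inception Score | 48.7 | #1 |
| Unconditional Image Generation | ImageNet 128x128 | Logo-GAN-AE Sage et al. (2018) | FID | 50.9 | #3 |
| Unconditional Image Generation | ImageNet 128x128 | Logo-GAN-AE Sage et al. (2018) | Inception Score | 14.4 | #3 |
| Unconditional Image Generation | ImageNet 128x128 | PGMGAN Armandpour et al. (2021) | FID | 21.7 | #2 |
| Unconditional Image Generation | ImageNet 128x128 | PGMGAN Armandpour et al. (2021) | Inception Score | 23.3 | #2 |
| Unconditional Image Generation | ImageNet 128x128 | PacGAN2 Lin et al. (2018) | FID | 57.5 | #4 |
| Unconditional Image Generation | ImageNet 128x128 | PacGAN2 Lin et al. (2018) | Inception Score | 13.5 | #4 |
| Unconditional Image Generation | ImageNet 256x256 | ADM Dhariwal and Nichol (2021) | FID | 26.2 | #2 |
| Unconditional Image Generation | ImageNet 256x256 | ADM Dhariwal and Nichol (2021) | Inception Score | 39.7 | #2 |
| Unconditional Image Generation | ImageNet 256x256 | IC-GAN (chx96) + DA | FID | 15.6±0.1 | #1 |
| Unconditional Image Generation | ImageNet 256x256 | IC-GAN (chx96) + DA | Inception Score | 59.0±0.4 | #1 |
| Conditional Image Generation | ImageNet 256x256 | BigGAN+ [Brock et al.] (chx96) | FID | 8.1 | #4 |
| Conditional Image Generation | ImageNet 256x256 | BigGAN+ [Brock et al.] (chx96) | Inception Score | 144.2 | #5 |
| Conditional Image Generation | ImageNet 256x256 | IC-GAN (chx96) + DA | FID | 8.2±0.1 | #5 |
| Conditional Image Generation | ImageNet 256x256 | IC-GAN (chx96) + DA | Inception Score | 173.8±0.9 | #4 |
| Unconditional Image Generation | ImageNet 64x64 | Uncond. BigGAN | FID | 16.9 | #2 |
| Unconditional Image Generation | ImageNet 64x64 | Uncond. BigGAN | Inception Score | 14.6±0.1 | #2 |
| Unconditional Image Generation | ImageNet 64x64 | IC-GAN + DA | FID | 9.2 | #1 |
| Unconditional Image Generation | ImageNet 64x64 | IC-GAN + DA | Inception Score | 23.5±0.1 | #1 |
| Conditional Image Generation | ImageNet 64x64 | IC-GAN + DA | FID | 6.7 | #1 |
| Conditional Image Generation | ImageNet 64x64 | IC-GAN + DA | Inception Score | 45.9±0.3 | #1 |
| Conditional Image Generation | ImageNet 64x64 | BigGAN* [Brock et al.] +DA | FID | 10.2±0.1 | #4 |
| Conditional Image Generation | ImageNet 64x64 | BigGAN* [Brock et al.] +DA | Inception Score | 30.1±0.1 | #3 |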

Methods