FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery

We propose FineGAN, a novel unsupervised GAN framework, which disentangles the background, object shape, and object appearance to hierarchically generate images of fine-grained object categories. To disentangle the factors without supervision, our key idea is to use information theory to associate each factor with a latent code, and to condition the relationships between the codes in a specific way to induce the desired hierarchy. Through extensive experiments, we show that FineGAN achieves the desired disentanglement to generate realistic and diverse images belonging to fine-grained classes of birds, dogs, and cars. Using FineGAN's automatically learned features, we also cluster real images as a first attempt at solving the novel problem of unsupervised fine-grained object category discovery. Code, models, and a demo are available at https://github.com/kkanshul/finegan
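
To make the idea of hierarchically related latent codes more concrete, here is a minimal sketch of one way three one-hot codes (background, shape, appearance) could be sampled so that each appearance code belongs to exactly one shape group. The abstract does not spell out the actual constraint, so the grouping rule, the code sizes, and every name below (`sample_codes`, `APPEARANCES_PER_SHAPE`, and so on) are illustrative assumptions rather than the paper's implementation.

```python
import random

# Hypothetical sizes, chosen only for illustration.
NUM_SHAPE_CODES = 4        # "parent" codes (object shape)
APPEARANCES_PER_SHAPE = 3  # "child" codes grouped under each parent
NUM_APPEARANCE_CODES = NUM_SHAPE_CODES * APPEARANCES_PER_SHAPE
NUM_BACKGROUND_CODES = 5


def one_hot(index, size):
    """Return a one-hot list of length `size` with a 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def sample_codes(rng):
    """Sample one (background, shape, appearance) triple of one-hot codes.

    Assumed constraint: the shape code is derived from the appearance code,
    so consecutive appearance indices share the same shape (parent) index.
    The background code is sampled independently in this sketch.
    """
    appearance_idx = rng.randrange(NUM_APPEARANCE_CODES)
    shape_idx = appearance_idx // APPEARANCES_PER_SHAPE
    background_idx = rng.randrange(NUM_BACKGROUND_CODES)
    return (
        one_hot(background_idx, NUM_BACKGROUND_CODES),
        one_hot(shape_idx, NUM_SHAPE_CODES),
        one_hot(appearance_idx, NUM_APPEARANCE_CODES),
    )


rng = random.Random(0)
background, shape, appearance = sample_codes(rng)
print(background, shape, appearance)
```

In a setup like this, a generator conditioned on the three codes would see many appearances per shape, which is one way a shape-then-appearance hierarchy could be encouraged.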

CVPR 2019

Results from the Paper


Task              Dataset         Model    Metric Name      Metric Value  Global Rank
Image Generation  CUB 128 x 128   FineGAN  FID              11.25         # 2
Image Generation  CUB 128 x 128   FineGAN  Inception score  52.53         # 1
Image Clustering  CUB Birds       FineGAN  Accuracy         0.126         # 1
Image Clustering  CUB Birds       FineGAN  NMI              0.403         # 1
Image Clustering  Stanford Cars   FineGAN  Accuracy         0.078         # 1
Image Clustering  Stanford Cars   FineGAN  NMI              0.354         # 1
Image Generation  Stanford Cars   FineGAN  FID              16.03         # 2
Image Generation  Stanford Cars   FineGAN  Inception score  32.62         # 1
Image Clustering  Stanford Dogs   FineGAN  Accuracy         0.079         # 1
Image Clustering  Stanford Dogs   FineGAN  NMI              0.233         # 1
Image Generation  Stanford Dogs   FineGAN  FID              25.66         # 2
Image Generation  Stanford Dogs   FineGAN  Inception score  46.92         # 1
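
The clustering results above are reported as accuracy and NMI. As a rough illustration of what these two numbers measure, the sketch below computes them in plain Python for a toy labelling. It assumes the common conventions (accuracy under the best one-to-one matching of clusters to classes, and NMI normalized by the arithmetic mean of the two entropies); this page does not state which exact definitions were used for the numbers listed, and the function names are hypothetical. The brute-force matching is only practical for a handful of clusters; a real evaluation would typically use the Hungarian algorithm instead.

```python
import math
from collections import Counter
from itertools import permutations


def entropy(labels):
    """Shannon entropy (in nats) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(true_labels, cluster_labels):
    """Normalized mutual information, normalized by the mean entropy."""
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)
    mi = 0.0
    for (t, c), n_tc in joint.items():
        mi += (n_tc / n) * math.log(n_tc * n / (true_counts[t] * cluster_counts[c]))
    denom = (entropy(true_labels) + entropy(cluster_labels)) / 2
    return mi / denom if denom > 0 else 1.0


def clustering_accuracy(true_labels, cluster_labels):
    """Accuracy under the best one-to-one cluster-to-class assignment.

    Tries every assignment by brute force, so use only for small label sets.
    """
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_labels))
    joint = Counter(zip(true_labels, cluster_labels))
    # Pad classes with dummies so surplus clusters can stay unmatched.
    classes += [None] * max(0, len(clusters) - len(classes))
    best = 0
    for perm in permutations(classes):
        matched = sum(joint[(cls, clu)] for cls, clu in zip(perm, clusters))
        best = max(best, matched)
    return best / len(true_labels)


# Toy example: cluster ids are arbitrary, one sample is misassigned.
true_labels = [0, 0, 1, 1, 2, 2]
cluster_labels = [1, 1, 0, 0, 2, 0]
print(clustering_accuracy(true_labels, cluster_labels))
print(nmi(true_labels, cluster_labels))
```

Because cluster ids carry no meaning, both metrics are invariant to relabelling the clusters, which is why a matching step (or an information-theoretic score such as NMI) is needed rather than a direct label comparison.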

Methods