LatentKeypointGAN: Controlling GANs via Latent Keypoints

29 Mar 2021 · Xingzhe He, Bastian Wandt, Helge Rhodin

Generative adversarial networks (GANs) have attained photo-realistic quality. However, how best to control the image content remains an open challenge... We introduce LatentKeypointGAN, a two-stage GAN that is trained end-to-end on the classical GAN objective yet internally conditioned on a set of sparse keypoints with associated appearance embeddings, which respectively control the position and style of the generated objects and their parts. A major difficulty, which we address with suitable network architectures and training schemes, is disentangling the image into spatial and appearance factors without domain knowledge or supervision signals for either factor. We demonstrate that LatentKeypointGAN provides an interpretable latent space that can be used to rearrange the generated images by repositioning and exchanging keypoint embeddings, for example by combining the eyes, nose, and mouth from different images to generate portraits. In addition, the explicit generation of keypoints and matching images enables a new, GAN-based methodology for unsupervised keypoint detection.
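The editing operations described in the abstract (repositioning keypoints and exchanging their appearance embeddings) can be illustrated with a small sketch. This is not the authors' code: the `Keypoint` class, the function names, and the tuple-based representation are all hypothetical, and the generator that would turn the edited keypoints into an image is not modeled here.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Keypoint:
    """Hypothetical latent keypoint: where a part is and what it looks like."""
    position: Tuple[float, float]   # (x, y); coordinate convention is an assumption
    embedding: Tuple[float, ...]    # appearance vector for this part (assumption)


def exchange_embeddings(
    target: Sequence[Keypoint],
    source: Sequence[Keypoint],
    indices: Sequence[int],
) -> List[Keypoint]:
    """Copy `target`, taking the appearance embedding from `source` at `indices`.

    Positions stay those of `target`, so only the style of the chosen parts changes.
    """
    if len(target) != len(source):
        raise ValueError("both images must use the same number of keypoints")
    chosen = set(indices)
    return [
        replace(t, embedding=s.embedding) if i in chosen else t
        for i, (t, s) in enumerate(zip(target, source))
    ]


def move_keypoint(
    keypoints: Sequence[Keypoint], index: int, dx: float, dy: float
) -> List[Keypoint]:
    """Copy `keypoints` with one keypoint shifted by (dx, dy); its embedding is kept."""
    moved = list(keypoints)
    x, y = moved[index].position
    moved[index] = replace(moved[index], position=(x + dx, y + dy))
    return moved
```

Under these assumptions, combining the mouth of one portrait with the rest of another would amount to calling `exchange_embeddings(face_a, face_b, [mouth_index])` and passing the result to the generator, where `mouth_index` is whichever keypoint the model happened to assign to the mouth.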

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Unsupervised Facial Landmark Detection | CelebA | LatentKeypointGAN | MSE normalized by inter-ocular distance | 5.85% | #1
Image Quality Assessment | CelebA-HQ | LatentKeypointGAN | FID-50k | 11.94 | #1
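For readers unfamiliar with the landmark metric, the following is a minimal sketch of one common reading of an error "normalized by inter-ocular distance": the mean Euclidean distance between predicted and ground-truth landmarks, divided by the ground-truth distance between the two eyes and reported as a percentage. The function name, the default eye indices, and the per-image formulation are assumptions; the exact evaluation protocol behind the 5.85% figure is not specified on this page.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def normalized_landmark_error(
    pred: Sequence[Point],
    gt: Sequence[Point],
    left_eye: int = 0,    # index of the left-eye landmark (assumption)
    right_eye: int = 1,   # index of the right-eye landmark (assumption)
) -> float:
    """Mean landmark error for one image, as a percentage of inter-ocular distance."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    inter_ocular = math.dist(gt[left_eye], gt[right_eye])
    if inter_ocular == 0:
        raise ValueError("eye landmarks coincide; cannot normalize")
    mean_error = sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)
    return 100.0 * mean_error / inter_ocular
```

Dividing by the inter-ocular distance makes the score independent of face size in pixels, so results can be compared across image resolutions.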