SWAGAN: A Style-based Wavelet-driven Generative Model

11 Feb 2021  ·  Rinon Gal, Dana Cohen, Amit Bermano, Daniel Cohen-Or

In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degraded quality in high-frequency content, stemming from a spectrally biased architecture and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach improves the visual quality of the generated images and considerably increases computational performance. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework and verifying that content generation in the wavelet domain leads to higher-quality images with more realistic high-frequency content. Furthermore, we verify that our model's latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved downstream visual quality.
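
As a rough illustration of what working "in the wavelet domain" means, the sketch below splits an image into one low-frequency band and three high-frequency detail bands, then reconstructs it exactly. It is not the authors' implementation: the abstract does not name a wavelet family, so the Haar wavelet is an assumption here, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical.

```python
import numpy as np


def haar_dwt2(img):
    """One level of an orthonormal 2D Haar transform (assumed wavelet).

    Expects a 2D array with even height and width. Returns four bands,
    each half the input resolution: one low-frequency band (ll) and
    three high-frequency detail bands (lh, hl, hh).
    """
    img = np.asarray(img, dtype=np.float64)
    # The four pixels of every non-overlapping 2x2 block.
    a = img[0::2, 0::2]  # top-left
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average: coarse content
    lh = (a - b + c - d) / 2.0  # difference between columns
    hl = (a + b - c - d) / 2.0  # difference between rows
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, recovering the full-resolution image."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=np.float64)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


if __name__ == "__main__":
    image = np.arange(16, dtype=np.float64).reshape(4, 4)
    bands = haar_dwt2(image)
    restored = haar_idwt2(*bands)
    print("bands:", [band.shape for band in bands])
    print("exact reconstruction:", np.allclose(restored, image))
```

Because the transform is invertible, a network that outputs the four bands loses no information relative to one that outputs pixels, while the high-frequency content is represented explicitly rather than left implicit in the pixel grid.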



Task              Dataset                    Model      Metric Name  Metric Value  Global Rank
Image Generation  FFHQ 1024 x 1024           SWAGAN-Bi  FID          4.06          #11
Image Generation  FFHQ 256 x 256             SWAGAN-Bi  FID          5.22          #19
Image Generation  LSUN Churches 256 x 256    SWAGAN-Bi  FID          4.97          #15