LAPGAN

Introduced by Denton et al. in Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

A LAPGAN, or Laplacian Generative Adversarial Network, is a type of generative adversarial network that generates images coarse to fine using a Laplacian pyramid representation. In the sampling procedure following training, we have a set of generative convnet models {$G_{0}, \dots , G_{K}$}, each of which captures the distribution of coefficients $h_{k}$ for natural images at a different level of the Laplacian pyramid. Sampling an image is akin to a reconstruction procedure, except that the generative models are used to produce the $h_{k}$’s:

$$ \tilde{I}_{k} = u\left(\tilde{I}_{k+1}\right) + \tilde{h}_{k} = u\left(\tilde{I}_{k+1}\right) + G_{k}\left(z_{k}, u\left(\tilde{I}_{k+1}\right)\right)$$

The recurrence starts by setting $\tilde{I}_{K+1} = 0$ and using the model at the final level $G_{K}$ to generate a residual image $\tilde{I}_{K}$ using noise vector $z_{K}$: $\tilde{I}_{K} = G_{K}\left(z_{K}\right)$. Models at all levels except the final are conditional generative models that take an upsampled version of the current image $\tilde{I}_{k+1}$ as a conditioning variable, in addition to the noise vector $z_{k}$.
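The sampling recurrence can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names, the nearest-neighbour `upsample` standing in for $u(\cdot)$, and the toy generators are all hypothetical placeholders, not the trained convnets or the upsampling operator from the paper.

```python
import numpy as np

def upsample(img):
    """Nearest-neighbour 2x upsampling; an assumed stand-in for u(.)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def sample_lapgan(generators, noise_dim, rng):
    """Run the recurrence from the coarsest level K down to level 0.

    generators[k] is a callable G_k(z, l). The last entry is G_K, which is
    unconditional and therefore receives l=None.
    """
    K = len(generators) - 1
    z = rng.standard_normal(noise_dim)
    image = generators[K](z, None)           # I~_K = G_K(z_K)
    for k in range(K - 1, -1, -1):
        l = upsample(image)                  # u(I~_{k+1})
        z = rng.standard_normal(noise_dim)   # fresh noise vector z_k
        image = l + generators[k](z, l)      # I~_k = u(I~_{k+1}) + h~_k
    return image

# Hypothetical toy generators, used only to make the sketch runnable.
def toy_coarse(z, l):
    return np.full((8, 8), z.mean())

def toy_cond(z, l):
    return 0.1 * z.mean() * np.ones_like(l)

rng = np.random.default_rng(0)
image = sample_lapgan([toy_cond, toy_cond, toy_coarse], noise_dim=4, rng=rng)
print(image.shape)  # (32, 32)
```

With $K = 2$ and an 8x8 coarsest level, each of the two conditional steps doubles the resolution, so the result is 32x32.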

The generative models {$G_{0}, \dots, G_{K}$} are trained using the CGAN approach at each level of the pyramid. Specifically, we construct a Laplacian pyramid from each training image $I$. At each level we make a stochastic choice (with equal probability) to either (i) construct the coefficients $h_{k}$ using the standard Laplacian pyramid coefficient generation procedure or (ii) generate them using $G_{k}$:

$$ \tilde{h}_{k} = G_{k}\left(z_{k}, u\left(I_{k+1}\right)\right) $$

Here $G_{k}$ is a convnet which uses a coarse scale version of the image $l_{k} = u\left(I_{k+1}\right)$ as an input, as well as noise vector $z_{k}$. $D_{k}$ takes as input $h_{k}$ or $\tilde{h}_{k}$, along with the low-pass image $l_{k}$ (which is explicitly added to $h_{k}$ or $\tilde{h}_{k}$ before the first convolution layer), and predicts if the image was real or generated. At the final scale of the pyramid, the low frequency residual is sufficiently small that it can be directly modeled with a standard GAN: $\tilde{h}_{K} = G_{K}\left(z_{K}\right)$ and $D_{K}$ only has $h_{K}$ or $\tilde{h}_{K}$ as input.
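The training setup described above can be illustrated with a small NumPy sketch. It builds the pyramid coefficients $h_{k} = I_{k} - u\left(I_{k+1}\right)$ and then makes the equal-probability choice between a real and a generated coefficient image for $D_{k}$. The helper names, the 2x2 average-pooling `downsample`, the nearest-neighbour `upsample`, and the generator passed in as `G_k` are assumptions made for illustration; the actual networks and filtering operators are not specified here.

```python
import numpy as np

def downsample(img):
    """2x2 average pooling; an assumed stand-in for blur-and-decimate.
    Requires even height and width."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    """Nearest-neighbour 2x upsampling; an assumed stand-in for u(.)."""
    return img.repeat(2, axis=0).repeat(2, axis=1)

def laplacian_pyramid(image, K):
    """Return [h_0, ..., h_{K-1}, h_K], where h_K = I_K is the
    low-frequency residual at the final level."""
    coeffs = []
    current = image
    for _ in range(K):
        coarser = downsample(current)                 # I_{k+1}
        coeffs.append(current - upsample(coarser))    # h_k = I_k - u(I_{k+1})
        current = coarser
    coeffs.append(current)
    return coeffs

def discriminator_example(I_k, G_k, noise_dim, rng):
    """Build one (h, l_k, label) triple for D_k at a non-final level k."""
    l_k = upsample(downsample(I_k))                   # l_k = u(I_{k+1})
    if rng.random() < 0.5:
        h, label = I_k - l_k, 1                       # real coefficients h_k
    else:
        z = rng.standard_normal(noise_dim)
        h, label = G_k(z, l_k), 0                     # generated h~_k
    return h, l_k, label
```

Because each $h_{k}$ is defined as the difference between $I_{k}$ and $u\left(I_{k+1}\right)$, summing back up the pyramid recovers the original image exactly, which is the reconstruction procedure that sampling imitates.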

Breaking the generation into successive refinements is the key idea. We give up any “global” notion of fidelity: no network is ever trained to discriminate between the output of the full cascade and a real image. Instead, the focus is on making each step plausible.

Source: Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks

Tasks

Task	Papers	Share
Image Generation	2	15.38%
Image Augmentation	1	7.69%
Image Reconstruction	1	7.69%
Image Registration	1	7.69%
Image-to-Image Translation	1	7.69%
Medical Image Generation	1	7.69%
Translation	1	7.69%
Unsupervised Image-To-Image Translation	1	7.69%
BIG-bench Machine Learning	1	7.69%

Components

Component	Type
Laplacian Pyramid	Image Representations

Categories

Generative Models

Generative Adversarial Networks