Pixel2Style2Pixel, or pSp, is an image-to-image translation framework based on a novel encoder that directly generates a series of style vectors, which are fed into a pretrained StyleGAN generator, forming the extended $\mathcal{W}+$ latent space. Feature maps are first extracted using a standard feature pyramid over a ResNet backbone. Then, for each of the $18$ target styles, a small mapping network is trained to extract the learned style from the corresponding feature map, where styles $(0-2)$ are generated from the small feature map, $(3-6)$ from the medium feature map, and $(7-18)$ from the largest feature map. The mapping network, map2style, is a small fully convolutional network that gradually reduces the spatial size using a set of 2-strided convolutions followed by LeakyReLU activations. Each generated 512-dimensional style vector is fed into StyleGAN, starting from its matching affine transformation, $A$.
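The map2style head described above can be sketched as follows. This is a minimal, hypothetical PyTorch version, assuming 3x3 stride-2 convolutions and a 512-channel output; the layer counts and channel widths are illustrative and not taken from the official pSp code.

```python
# Hypothetical sketch of a map2style mapping network: a small fully
# convolutional net that repeatedly halves the spatial size with
# stride-2 convolutions + LeakyReLU until a single 512-dim vector remains.
# Channel sizes and depth are assumptions, not the official implementation.
import math
import torch
import torch.nn as nn


class Map2Style(nn.Module):
    """Maps one feature map to a single 512-dimensional style vector."""

    def __init__(self, in_channels: int, spatial: int, out_dim: int = 512):
        super().__init__()
        layers = []
        channels = in_channels
        # Each 3x3 stride-2 conv (padding=1) halves the spatial resolution,
        # so log2(spatial) convolutions reduce it to 1x1.
        for _ in range(int(math.log2(spatial))):
            layers += [
                nn.Conv2d(channels, out_dim, kernel_size=3, stride=2, padding=1),
                nn.LeakyReLU(0.2),
            ]
            channels = out_dim
        self.body = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (N, out_dim, 1, 1) -> (N, out_dim)
        return self.body(x).flatten(start_dim=1)


# One such head is trained per target style; e.g. a medium 16x16 feature map:
head = Map2Style(in_channels=512, spatial=16)
style = head(torch.randn(2, 512, 16, 16))
print(style.shape)  # torch.Size([2, 512])
```

In the full encoder, each of the 18 resulting vectors would then enter StyleGAN at its corresponding affine transformation $A$, rather than passing through StyleGAN's own mapping network.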
Source: Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation
Task | Papers | Share |
---|---|---|
Super-Resolution | 1 | 20.00% |
Conditional Image Generation | 1 | 20.00% |
Face Generation | 1 | 20.00% |
Image-to-Image Translation | 1 | 20.00% |
Translation | 1 | 20.00% |
Component | Type |
---|---|
Leaky ReLU | Activation Functions |
ResNet | Convolutional Neural Networks |
StyleGAN | Generative Models |