Each connection extracts the style feature of the latent feature maps in the encoder and then performs a residual learning based mapping function in the global information space guided by the target attributes.
In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.
Based on the semantic priors, we further propose a context-aware image inpainting model, which adaptively integrates global semantics and local features in a unified image generator.
Specifically, we take both the historical motion sequences and coarse prediction as input of our cascaded refinement network to predict refined human motion and strengthen the refinement network with adversarial error augmentation.
Caricature is an artistic drawing created to abstract or exaggerate facial features of a person.
Inspired by the widely-used structural similarity (SSIM) index in image quality assessment, we use the linear correlation between two images to quantify their structural similarity.
To be specific, our method explicitly requires the network to predict semantic segmentation as well as dilated affinity, which is a sparse version of pair-wise pixel affinity.
However, current state-of-the-art face parsing methods require large amounts of labeled data on the pixel-level and such process for caricature is tedious and labor-intensive.