Face generation is the task of generating (or interpolating) new faces from an existing dataset.
The state-of-the-art results for this task are located in the Image Generation parent.
We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses.
SOTA for Image Generation on CIFAR-10
However, when people talk, the subtle movements of their face region are usually a complex combination of the intrinsic face appearance of the subject and also the extrinsic speech to be delivered. We assume the talking face sequence is actually a composition of both subject-related information and speech-related information.
To this end, we propose a deep Age-Invariant Model (AIM) for face recognition in the wild with three distinct novelties. Benchmarking our model on one of the most popular unconstrained face recognition datasets IJB-C additionally verifies the promising generalizability of AIM in recognizing faces in the wild.
We show that adversarial training can be used to learn a generative model with true observational and interventional distributions if the generator architecture is consistent with the given causal graph. The conditional GAN combined with a trained causal implicit generative model for the labels is then a causal implicit generative model over the labels and the generated image.
We propose a novel end-to-end semi-supervised adversarial framework to generate photorealistic face images of new identities with wide ranges of expressions, poses, and illuminations conditioned by a 3D morphable model. Previous adversarial style-transfer methods either supervise their networks with large volume of paired data or use unpaired data with a highly under-constrained two-way generative framework in an unsupervised fashion.
In this paper, we show empirical evidence of this effect, and propose a new loss to bridge the gap between the distributions of the input and target domains. To show this is effective, we incorporate the triple consistency loss into the training of a new landmark-guided face to face synthesis, where, contrary to previous works, the generated images can simultaneously undergo a large transformation in both expression and pose.
This paper presents a simple method for "do as I do" motion transfer: given a source video of a person dancing we can transfer that performance to a novel (amateur) target after only a few minutes of the target subject performing standard moves. We pose this problem as a per-frame image-to-image translation with spatio-temporal smoothing.
In this paper, we take a different approach, where we formulate the original problem as a stage-wise learning problem. We first synthesize the facial sketch corresponding to the visual attributes and then we reconstruct the face image based on the synthesized sketch.
Several works exist that attempt to perform high-level face-related analysis tasks based on landmarks. The primary aim of this work is to demonstrate that information preserved by landmarks (gender in particular) can be further accentuated by leveraging generative models to synthesize corresponding faces.
We present a new stage-wise learning paradigm for training generative adversarial networks (GANs). The goal of our work is to progressively strengthen the discriminator and thus, the generators, with each subsequent stage without changing the network architecture.