Face Generation
120 papers with code • 0 benchmarks • 4 datasets
Face generation is the task of generating (or interpolating) new faces from an existing dataset.
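Interpolating new faces usually means walking between points in a generator's latent space. A minimal sketch (the 512-dim latent size and the spherical-interpolation choice are illustrative assumptions; the generator that decodes each latent into a face is assumed pretrained and omitted):

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors.
    Often preferred over linear interpolation for Gaussian latents,
    since it stays close to the typical norm of the prior."""
    cos_omega = np.dot(z0, z1) / (np.linalg.norm(z0) * np.linalg.norm(z1))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z0
    s = np.sin(omega)
    return np.sin((1 - t) * omega) / s * z0 + np.sin(t * omega) / s * z1

# Hypothetical usage: sample two latents and walk between them;
# each point on the path would be decoded by a generator G (not shown).
rng = np.random.default_rng(0)
z_a, z_b = rng.standard_normal(512), rng.standard_normal(512)
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 8)]
```

Decoding each latent on `path` yields a smooth morph between the two sampled faces.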
The state-of-the-art results for this task are tracked under the parent task, Image Generation.
(Image credit: Progressive Growing of GANs for Improved Quality, Stability, and Variation)
Benchmarks
These leaderboards are used to track progress in Face Generation.
Libraries
Use these libraries to find Face Generation models and implementations.
Latest papers with no code
Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation
In the task of talking face generation, the objective is to generate a face video with lips synchronized to the corresponding audio while preserving visual details and identity information.
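Lip-audio synchronization is commonly measured by embedding short audio windows and mouth-region crops into a shared space and comparing them (a SyncNet-style sketch; the embedding networks that would produce these per-frame vectors are assumed and omitted):

```python
import numpy as np

def sync_score(audio_emb, lip_emb):
    """Mean cosine similarity between per-frame audio and lip
    embeddings; higher means better lip-audio synchronization."""
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    v = lip_emb / np.linalg.norm(lip_emb, axis=1, keepdims=True)
    return float(np.mean(np.sum(a * v, axis=1)))

# Illustrative check: identical embedding streams score perfectly.
emb = np.random.default_rng(1).standard_normal((5, 8))
score = sync_score(emb, emb)  # ≈ 1.0 for identical streams
```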
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting
We present GSTalker, a 3D audio-driven talking face generation model with Gaussian Splatting that offers both fast training (40 minutes) and real-time rendering (125 FPS) from only a 3–5 minute video of training material, in comparison with previous 2D and 3D NeRF-based modeling frameworks, which require hours of training and seconds of rendering per frame.
TextGaze: Gaze-Controllable Face Generation with Natural Language
Our work first introduces a text-of-gaze dataset containing over 90k text descriptions spanning a dense distribution of gaze and head poses.
Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control
This work presents Sketch2Human, the first system for controllable full-body human image generation guided by a semantic sketch (for geometry control) and a reference image (for appearance control).
Adversarial Identity Injection for Semantic Face Image Synthesis
Among all the explored techniques, Semantic Image Synthesis (SIS) methods, whose goal is to generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern.
SphereHead: Stable 3D Full-head Synthesis with Spherical Tri-plane Representation
We further introduce a view-image consistency loss for the discriminator to emphasize the correspondence of the camera parameters and the images.
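Such a view-image consistency loss can be sketched as an auxiliary regression head on the discriminator that predicts each image's camera parameters, penalized against the parameters the image was actually rendered with (the L2 form and the yaw/pitch/roll parameterization below are illustrative assumptions, not the paper's exact formulation):

```python
import numpy as np

def view_consistency_loss(pred_pose, cond_pose):
    """L2 penalty between camera parameters predicted by a
    discriminator head and the conditioning camera parameters."""
    return float(np.mean((pred_pose - cond_pose) ** 2))

# Illustrative batch: poses (yaw, pitch, roll) predicted by a
# hypothetical discriminator head vs. the true rendering cameras.
pred = np.array([[0.10, -0.02, 0.00],
                 [0.48,  0.21, 0.01]])
cond = np.array([[0.12, -0.02, 0.00],
                 [0.50,  0.20, 0.00]])
loss = view_consistency_loss(pred, cond)
```

Minimizing this term pushes the generator to produce images whose apparent viewpoint matches the camera it was conditioned on.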
DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation
While large-scale pre-trained text-to-image models can synthesize diverse and high-quality human-centered images, novel challenges arise with a nuanced task of "identity fine editing": precisely modifying specific features of a subject while maintaining its inherent identity and context.
Superior and Pragmatic Talking Face Generation with Teacher-Student Framework
Talking face generation technology creates talking videos from arbitrary appearance and motion signals; the "arbitrary" offers ease of use but also introduces challenges in practical applications.
FlowVQTalker: High-Quality Emotional Talking Face Generation through Normalizing Flow and Quantization
Specifically, we develop a flow-based coefficient generator that encodes the dynamics of facial emotion into a multi-emotion-class latent space represented as a mixture distribution.
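A multi-emotion-class latent space represented as a mixture distribution can be sketched as a Gaussian mixture with one component per emotion class (the component means, latent size, and sampling routine below are illustrative assumptions, not the paper's learned model):

```python
import numpy as np

rng = np.random.default_rng(42)
LATENT_DIM = 16
EMOTIONS = ["neutral", "happy", "angry", "sad"]

# One Gaussian component per emotion class; in a trained model these
# means would be learned, here they are random placeholders.
means = {e: rng.standard_normal(LATENT_DIM) for e in EMOTIONS}

def sample_latent(emotion, sigma=0.1):
    """Draw a latent code from the mixture component of the
    requested emotion class."""
    return means[emotion] + sigma * rng.standard_normal(LATENT_DIM)

z = sample_latent("happy")
```

Conditioning sampling on the class this way lets one latent space cover several emotions while keeping each emotion's codes clustered.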
Style2Talker: High-Resolution Talking Head Generation with Emotion Style and Art Style
Although automatically animating audio-driven talking heads has recently received growing interest, previous efforts have mainly concentrated on achieving lip synchronization with the audio, neglecting two crucial elements for generating expressive videos: emotion style and art style.