Talking Head Generation
40 papers with code • 7 benchmarks • 3 datasets
Talking head generation is the task of generating a talking face from a set of images of a person, typically driven by a signal such as audio, text, or a driving video.
(Image credit: Few-Shot Adversarial Learning of Realistic Neural Talking Head Models)
Most implemented papers
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation
To be specific, our framework consists of a speaker-independent stage and a speaker-specific stage.
Txt2Vid: Ultra-Low Bitrate Compression of Talking-Head Videos via Text
Video represents the majority of internet traffic today, driving a continual race between the generation of higher-quality content, the transmission of ever-larger files, and the development of network infrastructure.
Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion
As this keypoint based representation models the motions of facial regions, head, and backgrounds integrally, our method can better constrain the spatial and temporal consistency of the generated videos.
Live Speech Portraits: Real-Time Photorealistic Talking-Head Animation
The first stage is a deep neural network that extracts deep audio features along with a manifold projection to project the features to the target person's speech space.
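As a rough illustration of this idea, a manifold projection can pull an arbitrary audio feature back onto the target speaker's feature space by replacing it with a combination of its nearest neighbours in a database of features extracted from that speaker's own speech. The function name, uniform neighbour weighting, and toy data below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def manifold_project(query, database, k=8):
    """Hypothetical sketch: project an audio feature onto the target
    speaker's feature manifold by averaging its k nearest neighbours
    from a database of that speaker's own audio features."""
    dists = np.linalg.norm(database - query, axis=1)
    idx = np.argsort(dists)[:k]
    # Uniform weighting for simplicity; barycentric (LLE-style)
    # weights are a common alternative.
    return database[idx].mean(axis=0)

rng = np.random.default_rng(0)
db = rng.normal(size=(100, 16))   # stand-in for target-speaker features
q = rng.normal(size=16) * 3.0     # out-of-distribution query feature
proj = manifold_project(q, db, k=8)
```

The projected feature always lies inside the target speaker's observed feature distribution, which is what keeps the downstream animation speaker-specific.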
AnimeCeleb: Large-Scale Animation CelebHeads Dataset for Head Reenactment
We present a novel Animation CelebHeads dataset (AnimeCeleb) to address animation head reenactment.
AI-generated characters for supporting personalized learning and well-being
Advancements in machine learning have recently enabled the hyper-realistic synthesis of prose, images, audio and video data, in what is referred to as artificial intelligence (AI)-generated media.
Depth-Aware Generative Adversarial Network for Talking Head Video Generation
The depth is further utilized to learn dense 3D-aware cross-modal (i.e., appearance and depth) attention that guides the generation of motion fields for warping source-image representations.
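The cross-modal attention described above can be sketched as standard scaled dot-product attention in which appearance features form the queries and depth features form the keys and values, so that geometry steers where the motion-field generator looks. This is a minimal NumPy sketch under that assumption (learned projection matrices are omitted), not the paper's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # numerically stable
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(appearance, depth):
    """Hypothetical sketch of 3D-aware cross-modal attention.

    appearance, depth: (N, d) flattened per-pixel feature maps.
    Appearance queries attend over depth-derived keys/values, so the
    output mixes geometric cues into each appearance location.
    """
    d = appearance.shape[-1]
    q = appearance        # queries from appearance features
    k = v = depth         # keys/values from the predicted depth map
    attn = softmax(q @ k.T / np.sqrt(d))
    return attn @ v

rng = np.random.default_rng(0)
app = rng.normal(size=(64, 32))
dep = rng.normal(size=(64, 32))
out = cross_modal_attention(app, dep)
```

Each output row is a convex combination of depth features, weighted by how strongly the corresponding appearance location matches each depth location.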
Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer
This paper reports our solution for ACM Multimedia ViCo 2022 Conversational Head Generation Challenge, which aims to generate vivid face-to-face conversation videos based on audio and reference images.
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
Thus the facial radiance field can be flexibly adjusted to the new identity with few reference images.
Compressing Video Calls using Synthetic Talking Heads
We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver.
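The bandwidth saving in such a scheme comes from sending occasional full "pivot" frames and only a handful of facial keypoints for every other frame, which the receiver uses to warp the last pivot with a reenactment network. The toy encoder below (function names, keypoint count, and precision are illustrative assumptions) shows the size asymmetry between the two payload types:

```python
import numpy as np

def encode_frame(frame, detect_keypoints, is_pivot):
    """Hypothetical sketch of a keypoint-based call encoder.

    Pivot frames are sent in full; for non-pivot frames only a small
    set of facial keypoints is transmitted, and the receiver
    reconstructs the frame by reenacting the last pivot.
    """
    if is_pivot:
        return ("pivot", frame.astype(np.uint8).tobytes())
    kps = detect_keypoints(frame).astype(np.float16)  # e.g. 10 (x, y) points
    return ("keypoints", kps.tobytes())

# Toy demo: a 256x256 RGB frame vs. 10 two-dimensional keypoints.
frame = np.zeros((256, 256, 3), dtype=np.uint8)
detect = lambda f: np.zeros((10, 2), dtype=np.float32)  # stand-in detector
_, full_payload = encode_frame(frame, detect, is_pivot=True)
_, kp_payload = encode_frame(frame, detect, is_pivot=False)
```

Even before conventional video coding, the keypoint payload is several orders of magnitude smaller than the raw frame, which is where the compression gain originates.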