Talking Face Generation

20 papers with code • 1 benchmark • 4 datasets

Talking face generation aims to synthesize a sequence of face images that correspond to given speech semantics.
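
Most methods follow the same input/output contract: encode the identity from one or more face images, encode the speech (e.g., as mel-spectrogram features), and decode one frame per audio window. The PyTorch sketch below is a minimal, hypothetical illustration of that contract; the module names, layer sizes, and feature dimensions are illustrative and do not come from any specific paper.

```python
import torch
import torch.nn as nn

class TalkingFaceGenerator(nn.Module):
    """Hypothetical minimal generator: identity image + per-frame audio -> video frames."""

    def __init__(self, audio_dim=80, id_dim=256, img_size=64):
        super().__init__()
        self.img_size = img_size
        # Identity encoder: a single face image -> identity embedding.
        self.id_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, id_dim),
        )
        # Audio encoder: one mel-spectrogram frame -> audio embedding.
        self.audio_encoder = nn.Sequential(
            nn.Linear(audio_dim, 128), nn.ReLU(), nn.Linear(128, 128),
        )
        # Frame decoder: fused identity + audio embedding -> RGB frame.
        self.decoder = nn.Sequential(
            nn.Linear(id_dim + 128, 64 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (64, 8, 8)),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, identity_img, audio_frames):
        # identity_img: (B, 3, H, W); audio_frames: (B, T, audio_dim)
        id_emb = self.id_encoder(identity_img)              # (B, id_dim)
        frames = []
        for t in range(audio_frames.shape[1]):
            a_emb = self.audio_encoder(audio_frames[:, t])  # (B, 128)
            frames.append(self.decoder(torch.cat([id_emb, a_emb], dim=1)))
        return torch.stack(frames, dim=1)                   # (B, T, 3, 64, 64)

model = TalkingFaceGenerator()
video = model(torch.randn(2, 3, 64, 64), torch.randn(2, 5, 80))
print(video.shape)  # torch.Size([2, 5, 3, 64, 64])
```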

(Image credit: Talking Face Generation by Adversarially Disentangled Audio-Visual Representation)

Most implemented papers

MakeItTalk: Speaker-Aware Talking-Head Animation

yzhou359/MakeItTalk 27 Apr 2020

We present a method that generates expressive talking heads from a single facial image with audio as the only input.

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

Rudrabha/Wav2Lip 23 Aug 2020

Prior approaches work well on static images or on videos of specific speakers seen during training, but they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.
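
Wav2Lip's central idea is to train the generator against a pre-trained lip-sync "expert" (a SyncNet-style network) that scores how well a short window of mouth frames matches the accompanying audio. Below is a hedged sketch of that kind of sync loss; the two embedding networks and tensor sizes are stand-ins, not the released Wav2Lip architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyncExpert(nn.Module):
    """Stand-in for a pre-trained lip-sync expert (SyncNet-style).

    Embeds a short window of mouth-region frames and the matching audio chunk
    into a shared space; cosine similarity measures audio-visual sync.
    """

    def __init__(self, emb_dim=128):
        super().__init__()
        # 5 stacked RGB mouth crops -> visual embedding (channels = 5 * 3).
        self.face_net = nn.Sequential(
            nn.Conv2d(15, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, emb_dim),
        )
        # Mel-spectrogram chunk -> audio embedding.
        self.audio_net = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, emb_dim),
        )

    def forward(self, face_window, mel_chunk):
        v = F.normalize(self.face_net(face_window), dim=1)
        a = F.normalize(self.audio_net(mel_chunk), dim=1)
        # Map cosine similarity [-1, 1] to a sync probability in [0, 1].
        return (F.cosine_similarity(v, a) + 1.0) / 2.0

def sync_loss(expert, generated_window, mel_chunk):
    """Penalize the generator when the expert judges its frames out of sync."""
    p_sync = expert(generated_window, mel_chunk)
    return F.binary_cross_entropy(p_sync, torch.ones_like(p_sync))

# Usage sketch: the expert stays frozen; only the generator receives gradients.
expert = SyncExpert().eval()
for p in expert.parameters():
    p.requires_grad_(False)
fake_frames = torch.randn(4, 15, 96, 96)   # 5 generated mouth crops stacked on channels
mel = torch.randn(4, 1, 80, 16)            # matching mel-spectrogram window
loss = sync_loss(expert, fake_frames, mel)
print(loss.item())
```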

Talking Face Generation by Conditional Recurrent Adversarial Network

susanqq/Talking_Face_Generation 13 Apr 2018

Given an arbitrary face image and an arbitrary speech clip, the proposed work attempts to generate a talking face video with accurate lip synchronization while maintaining a smooth transition of both lip and facial movements over the entire video clip.
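
The recurrent formulation conditions each frame on the previous hidden state as well as the current audio feature and the identity image, which is what encourages smooth frame-to-frame motion. A minimal, hypothetical sketch of such a recurrent generator (not the authors' released model, and omitting the adversarial discriminators) is shown below.

```python
import torch
import torch.nn as nn

class RecurrentTalkingFaceGenerator(nn.Module):
    """Hypothetical GRU-based generator: identity + per-step audio -> frames.

    The recurrent state carries information across time steps, encouraging
    smooth lip and facial motion instead of independently generated frames.
    """

    def __init__(self, audio_dim=80, id_dim=128, hidden_dim=256):
        super().__init__()
        self.id_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, id_dim),
        )
        self.rnn = nn.GRU(audio_dim + id_dim, hidden_dim, batch_first=True)
        self.frame_decoder = nn.Sequential(
            nn.Linear(hidden_dim, 32 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (32, 8, 8)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Tanh(),
        )

    def forward(self, identity_img, audio_seq):
        # identity_img: (B, 3, H, W); audio_seq: (B, T, audio_dim)
        B, T, _ = audio_seq.shape
        id_emb = self.id_encoder(identity_img).unsqueeze(1).expand(B, T, -1)
        hidden, _ = self.rnn(torch.cat([audio_seq, id_emb], dim=2))  # (B, T, hidden_dim)
        frames = self.frame_decoder(hidden.reshape(B * T, -1))
        return frames.view(B, T, 3, 32, 32)

gen = RecurrentTalkingFaceGenerator()
out = gen(torch.randn(2, 3, 64, 64), torch.randn(2, 10, 80))
print(out.shape)  # torch.Size([2, 10, 3, 32, 32])
```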

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

Hangz-nju-cuhk/Talking-Face-Generation-DAVS 20 Jul 2018

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.
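
The disentanglement in this method separates speech content from speaker identity in a shared audio-visual embedding space, so the content code drives the mouth while the identity code is held fixed. One common way to impose that kind of separation is an adversarial identity classifier with gradient reversal; the sketch below illustrates the general idea and is not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output

def grad_reverse(x):
    return GradReverse.apply(x)

class DisentangledSpeechEncoder(nn.Module):
    """Hypothetical encoder whose speech-content code is trained to be
    uninformative about speaker identity via an adversarial identity head."""

    def __init__(self, feat_dim=80, content_dim=64, num_speakers=100):
        super().__init__()
        self.content_encoder = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, content_dim),
        )
        # The adversary tries to recover the speaker from the content code;
        # the reversed gradient pushes the encoder to discard identity cues.
        self.identity_adversary = nn.Linear(content_dim, num_speakers)

    def forward(self, audio_feat, speaker_label):
        content = self.content_encoder(audio_feat)
        speaker_logits = self.identity_adversary(grad_reverse(content))
        adv_loss = nn.functional.cross_entropy(speaker_logits, speaker_label)
        return content, adv_loss

enc = DisentangledSpeechEncoder()
content, adv_loss = enc(torch.randn(8, 80), torch.randint(0, 100, (8,)))
adv_loss.backward()  # encoder gradients are reversed for the adversarial term
print(content.shape, adv_loss.item())
```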

ReenactGAN: Learning to Reenact Faces via Boundary Transfer

wywu/ReenactGAN ECCV 2018

The source face is first mapped into a boundary latent space; a transformer is subsequently used to adapt the boundary of the source face to that of the target face, and a target-specific decoder renders the reenacted face.
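
Because the intermediate representation is a set of facial boundary heatmaps, reenactment decomposes into encode, adapt, and decode stages. The sketch below is a hypothetical, heavily simplified version of that three-stage pipeline (the "transformer" here is a boundary transformation network, as in the paper, not an attention-based Transformer); all three modules are stand-ins.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch, transpose=False):
    Conv = nn.ConvTranspose2d if transpose else nn.Conv2d
    return nn.Sequential(Conv(in_ch, out_ch, 4, stride=2, padding=1), nn.ReLU())

class BoundaryReenactor(nn.Module):
    """Hypothetical ReenactGAN-style pipeline: face -> boundary heatmaps ->
    target-adapted boundary -> target face. All three stages are stand-ins."""

    def __init__(self, num_boundaries=15):
        super().__init__()
        # Encoder: source face image -> facial boundary heatmaps.
        self.boundary_encoder = nn.Sequential(
            conv_block(3, 32), conv_block(32, 64),
            nn.Conv2d(64, num_boundaries, 3, padding=1), nn.Sigmoid(),
        )
        # Transformer: adapts source boundaries toward the target person's
        # boundary distribution (person-specific in the paper).
        self.boundary_transformer = nn.Sequential(
            conv_block(num_boundaries, 64),
            conv_block(64, num_boundaries, transpose=True), nn.Sigmoid(),
        )
        # Decoder: target-adapted boundaries -> target face image.
        self.face_decoder = nn.Sequential(
            conv_block(num_boundaries, 64, transpose=True),
            conv_block(64, 32, transpose=True),
            nn.Conv2d(32, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, source_face):
        src_boundary = self.boundary_encoder(source_face)        # (B, K, H/4, W/4)
        tgt_boundary = self.boundary_transformer(src_boundary)   # same resolution
        return self.face_decoder(tgt_boundary)                   # (B, 3, H, W)

model = BoundaryReenactor()
out = model(torch.randn(1, 3, 64, 64))
print(out.shape)  # torch.Size([1, 3, 64, 64])
```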

Capture, Learning, and Synthesis of 3D Speaking Styles

TimoBolkart/voca CVPR 2019

To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers.

Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss

lelechen63/ATVGnet CVPR 2019

We devise a cascade GAN approach to generating talking face videos that is robust to different face shapes, view angles, facial characteristics, and noisy audio conditions.
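
The dynamic pixel-wise loss weights the reconstruction error with an attention map so that pixels that actually move with speech, chiefly the mouth region, dominate the loss rather than the static background. The sketch below is one plausible, hypothetical instantiation in which the attention map is derived from frame differences; it is not ATVGnet's exact loss.

```python
import torch
import torch.nn.functional as F

def dynamic_pixelwise_loss(generated, target, previous_target, alpha=0.5):
    """Attention-weighted L1 reconstruction loss (illustrative, not ATVGnet's exact loss).

    generated, target, previous_target: (B, 3, H, W) frames in [-1, 1].
    Pixels that change between consecutive ground-truth frames (e.g. the lips)
    receive higher weight; static regions still contribute via the alpha floor.
    """
    # Motion-based attention: how much each pixel moved since the last frame.
    motion = (target - previous_target).abs().mean(dim=1, keepdim=True)   # (B, 1, H, W)
    attention = motion / (motion.amax(dim=(2, 3), keepdim=True) + 1e-8)   # normalize to [0, 1]
    weight = alpha + (1.0 - alpha) * attention                            # keep a base weight
    per_pixel = F.l1_loss(generated, target, reduction="none")            # (B, 3, H, W)
    return (weight * per_pixel).mean()

fake = torch.randn(2, 3, 64, 64)
real = torch.randn(2, 3, 64, 64)
prev = torch.randn(2, 3, 64, 64)
print(dynamic_pixelwise_loss(fake, real, prev).item())
```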

Neural Voice Puppetry: Audio-driven Facial Reenactment

miu200521358/NeuralVoicePuppetryMMD ECCV 2020

Neural Voice Puppetry has a variety of use-cases, including audio-driven video avatars, video dubbing, and text-driven video synthesis of a talking head.

Speech Driven Talking Face Generation from a Single Image and an Emotion Condition

eeskimez/emotalkingface 8 Aug 2020

Visual emotion expression plays an important role in audiovisual speech communication.
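
Here the generator receives a categorical emotion label alongside the audio and the single identity image, so the same utterance can be rendered with different expressions. The sketch below shows a minimal, hypothetical way to fuse an embedded emotion label with per-frame audio features before they drive a face decoder; the label set and dimensions are illustrative.

```python
import torch
import torch.nn as nn

EMOTIONS = ["neutral", "happy", "sad", "angry"]  # illustrative label set

class EmotionConditionedAudioEncoder(nn.Module):
    """Hypothetical conditioning module: fuses per-frame audio features with an
    embedded emotion label before they drive the face decoder."""

    def __init__(self, audio_dim=80, emo_dim=16, out_dim=128):
        super().__init__()
        self.emotion_embedding = nn.Embedding(len(EMOTIONS), emo_dim)
        self.fuse = nn.Sequential(
            nn.Linear(audio_dim + emo_dim, out_dim), nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, audio_frames, emotion_id):
        # audio_frames: (B, T, audio_dim); emotion_id: (B,) integer labels
        emo = self.emotion_embedding(emotion_id)                   # (B, emo_dim)
        emo = emo.unsqueeze(1).expand(-1, audio_frames.shape[1], -1)
        return self.fuse(torch.cat([audio_frames, emo], dim=2))    # (B, T, out_dim)

enc = EmotionConditionedAudioEncoder()
codes = enc(torch.randn(2, 10, 80), torch.tensor([1, 3]))  # "happy", "angry"
print(codes.shape)  # torch.Size([2, 10, 128])
```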