Constrained Lip-synchronization
6 papers with code • 0 benchmarks • 0 datasets
This task deals with lip-syncing a video (or an image) to the desired target speech. Approaches in this task work only for a specific, limited set of identities, languages, and speakers/voices. See also: Unconstrained lip-synchronization - https://paperswithcode.com/task/lip-sync
Benchmarks
These leaderboards are used to track progress in Constrained Lip-synchronization
Most implemented papers
DeepFakes: a New Threat to Face Recognition? Assessment and Detection
The best performing method, which is based on visual quality metrics and is often used in the presentation attack detection domain, resulted in an 8.97% equal error rate on high-quality Deepfakes.
ObamaNet: Photo-realistic lip-sync from text
We present ObamaNet, the first architecture that generates both audio and synchronized photo-realistic lip-sync videos from any new text.
Talking Face Generation by Conditional Recurrent Adversarial Network
Given an arbitrary face image and an arbitrary speech clip, the proposed work attempts to generate a talking-face video with accurate lip synchronization while maintaining smooth transitions of both lip and facial movement over the entire video clip.
Dynamic Temporal Alignment of Speech to Lips
This alignment is based on deep audio-visual features, mapping the lips video and the speech signal to a shared representation.
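Aligning a speech signal to a lip video once both are mapped into a shared feature space is, at its core, a sequence-alignment problem. A minimal sketch using classic dynamic time warping over toy 1-D embeddings (the paper uses learned deep audio-visual features; the `dtw_align` helper and the feature shapes here are illustrative assumptions, not the paper's method):

```python
import numpy as np

def dtw_align(a, b):
    """Dynamic time warping between two feature sequences.

    a, b: arrays of shape (T, d) in a shared embedding space.
    Returns the optimal alignment cost and the warping path
    as a list of (i, j) index pairs.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])  # local distance
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Backtrack from the end to recover the alignment path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1],
                              cost[i - 1, j],
                              cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return cost[n, m], path[::-1]
```

When the speech and lip sequences match frame for frame, the recovered path is the diagonal with zero cost; a time-shifted speech track produces a warped path that tells you how to retime one stream against the other.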
Real-Time Lip Sync for Live 2D Animation
The emergence of commercial tools for real-time performance-based 2D animation has enabled 2D characters to appear on live broadcasts and streaming platforms.
Not made for each other- Audio-Visual Dissonance-based Deepfake Detection and Localization
The Modality Dissonance Score (MDS) is computed as an aggregate of dissimilarity scores between audio and visual segments in a video.
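The aggregation itself is simple to sketch: score each aligned audio/visual segment pair for dissimilarity, then pool over the video. The function below is an illustrative simplification (mean Euclidean distance between precomputed segment embeddings), not the paper's learned contrastive dissimilarity:

```python
import numpy as np

def mds_score(audio_feats, visual_feats):
    """Aggregate audio-visual dissonance over a video.

    audio_feats, visual_feats: arrays of shape (num_segments, d),
    one embedding per temporally aligned segment. Returns the mean
    per-segment Euclidean distance; higher values suggest the audio
    and visual streams disagree (a cue for manipulated content).
    """
    dists = np.linalg.norm(audio_feats - visual_feats, axis=1)
    return float(dists.mean())
```

Thresholding such a score gives a video-level real/fake decision, and the per-segment distances localize which portions of the video are dissonant.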