Audio-Visual Synchronization
8 papers with code • 0 benchmarks • 3 datasets
Latest papers with no code
CoAVT: A Cognition-Inspired Unified Audio-Visual-Text Pre-Training Model for Multimodal Processing
To bridge the gap between modalities, CoAVT employs a query encoder, which contains a set of learnable query embeddings and extracts the audiovisual features most informative for the corresponding text.
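A learnable-query encoder of this kind can be sketched with plain cross-attention: a small set of trainable query vectors attends over a sequence of audiovisual frame features and pools them into a fixed number of outputs. The dimensions and the single-head attention below are illustrative assumptions, not CoAVT's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16           # hidden dimension (illustrative)
num_queries = 4  # number of learnable query embeddings
T = 10           # number of audiovisual feature frames

# Hypothetical learnable queries and extracted audiovisual frame features
queries = rng.standard_normal((num_queries, d))
av_feats = rng.standard_normal((T, d))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Cross-attention: each query attends over all audiovisual frames
attn = softmax(queries @ av_feats.T / np.sqrt(d))  # (num_queries, T)
extracted = attn @ av_feats                        # (num_queries, d)
print(extracted.shape)  # (4, 16)
```

In the full model the queries would be trained jointly with a text objective so that they learn to pull out text-relevant audiovisual content; here they are just random vectors showing the mechanics.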
Comparative Analysis of Deep-Fake Algorithms
We examine the various deep learning-based approaches used for creating deepfakes, as well as the techniques used for detecting them.
Audio-driven Talking Face Generation by Overcoming Unintended Information Flow
Specifically, this involves unintended flow of lip, pose and other information from the reference to the generated image, as well as instabilities during model training.
On the Audio-visual Synchronization for Lip-to-Speech Synthesis
Most lip-to-speech (LTS) synthesis models are trained and evaluated under the assumption that the audio-video pairs in the dataset are perfectly synchronized.
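When that assumption fails, a common first step is to estimate the audio-video offset by correlating a per-frame audio signal against a per-frame visual signal and picking the shift with maximum agreement. The signals and window size below are hypothetical stand-ins for real features, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 200
mouth_open = rng.random(T)                       # per-frame mouth-opening signal (hypothetical)
true_offset = 3
audio_energy = np.roll(mouth_open, true_offset)  # audio lags video by 3 frames

def estimate_offset(audio, video, max_shift=10):
    # Try shifts of the audio track and keep the one that best correlates with video
    shifts = list(range(-max_shift, max_shift + 1))
    scores = [np.dot(np.roll(audio, -s), video) for s in shifts]
    return shifts[int(np.argmax(scores))]

print(estimate_offset(audio_energy, mouth_open))  # 3
```

Shifting the audio back by the estimated offset would then resynchronize the pair before training or evaluation.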
SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory
It stores lip motion features from sequential ground truth images in the value memory and aligns them with corresponding audio features so that they can be retrieved using audio input at inference time.
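The retrieval step of such an audio-lip memory can be sketched as soft key-value lookup: audio features act as keys, aligned lip-motion features as values, and an audio query returns a similarity-weighted blend of stored lip features. All dimensions and the dot-product similarity are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
d_audio, d_lip, N = 8, 12, 50

# Memory built from training pairs: audio features key the aligned lip-motion features
keys = rng.standard_normal((N, d_audio))   # audio-feature keys (hypothetical)
values = rng.standard_normal((N, d_lip))   # lip-motion values from ground-truth frames

def retrieve(audio_query, keys, values):
    # Soft retrieval: similarity-weighted sum of the stored lip-motion features
    sims = keys @ audio_query / np.sqrt(len(audio_query))
    w = np.exp(sims - sims.max())
    w /= w.sum()
    return w @ values

lip_feat = retrieve(keys[7], keys, values)  # query with a known audio key
print(lip_feat.shape)  # (12,)
```

At inference time only audio is available, so the retrieved lip-motion feature substitutes for the ground-truth visual cue when generating the face.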
Rethinking Audio-visual Synchronization for Active Speaker Detection
This clarified definition is motivated by our extensive experiments, through which we discover that existing ASD methods fail to model audio-visual synchronization and often classify unsynchronized videos as active speaking.
Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation
In this paper, we address the problem of separating individual speech signals from videos using audio-visual neural processing.
Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning
When watching videos, the occurrence of a visual event is often accompanied by an audio event, e.g., the voice accompanying lip motion or the music of playing instruments.
Identity-Preserving Realistic Talking Face Generation
The necessary attributes of a realistic face animation are (1) audio-visual synchronization, (2) identity preservation of the target individual, (3) plausible mouth movements, and (4) presence of natural eye blinks.
Realistic Speech-Driven Facial Animation with GANs
We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features.