Lip Reading
45 papers with code • 3 benchmarks • 5 datasets
Lip Reading is the task of inferring speech content from a video using only visual information, especially the lip movements. It has many important practical applications, such as assisting audio-based speech recognition, biometric authentication, and aiding hearing-impaired people.
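The task setup can be sketched end to end: a clip of mouth-region frames goes through a per-frame visual front-end, a temporal aggregator, and a word classifier. This is a toy illustration only; the shapes, the random linear "front-end", and the mean-pooling "back-end" are assumptions standing in for the CNN and recurrent/temporal modules the papers below actually use.

```python
import numpy as np

# Toy sketch of the lip-reading pipeline: predict a word from a sequence of
# mouth-region frames, using only visual information (no audio).
rng = np.random.default_rng(0)

T, H, W = 29, 88, 88        # frames per clip, mouth-crop height/width (assumed)
num_words = 500             # assumed word-level vocabulary size

clip = rng.random((T, H, W))  # grayscale mouth crops over time

# 1) Per-frame visual features (random projection as a stand-in for a CNN).
feat_dim = 256
frontend = rng.standard_normal((H * W, feat_dim)) / np.sqrt(H * W)
frame_feats = clip.reshape(T, -1) @ frontend      # (T, feat_dim)

# 2) Temporal aggregation (stand-in for an RNN/TCN back-end).
clip_feat = frame_feats.mean(axis=0)              # (feat_dim,)

# 3) Word classification head.
classifier = rng.standard_normal((feat_dim, num_words)) / np.sqrt(feat_dim)
logits = clip_feat @ classifier
predicted_word = int(np.argmax(logits))
```

A real system replaces steps 1 and 2 with learned networks trained on labeled video, but the data flow — frames in, a word (or sentence) out — is the same.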
Source: Mutual Information Maximization for Effective Lip Reading
Most implemented papers
XFlow: Cross-modal Deep Neural Networks for Audiovisual Classification
Our work improves on existing multimodal deep learning algorithms in two essential ways: (1) it presents a novel method for performing cross-modality fusion before features are learned from individual modalities, and (2) it extends the previously proposed cross-connections, which only transfer information between streams that process compatible data.
Lip2AudSpec: Speech reconstruction from silent lip movements video
In this study, we propose a deep neural network for reconstructing intelligible speech from silent lip movement videos.
End-to-End Speech-Driven Facial Animation with Temporal GANs
To the best of our knowledge, this is the first method capable of generating subject independent realistic videos directly from raw audio.
Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.
Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers
In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers.
Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition
Recent advances in deep learning have heightened interest among researchers in the field of visual speech recognition (VSR).
Deformation Flow Based Two-Stream Network for Lip Reading
Observing the continuity between adjacent frames in the speaking process, and the consistency of motion patterns among different speakers pronouncing the same phoneme, we model the lip movements during speech as a sequence of apparent deformations in the lip region.
Mutual Information Maximization for Effective Lip Reading
By combining these two advantages, the proposed method is expected to be both discriminative and robust for effective lip reading.
Synchronous Bidirectional Learning for Multilingual Lip Reading
Based on this idea, we explore the synergized learning of multilingual lip reading in this paper, and further propose a synchronous bidirectional learning (SBL) framework for effective synergy across languages.
Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis
In this work, we explore the task of lip-to-speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.