Lip Reading

45 papers with code • 3 benchmarks • 5 datasets

Lip reading is the task of inferring the speech content of a video using only visual information, especially lip movements. It has many practical applications, such as assisting audio-based speech recognition, biometric authentication, and aiding hearing-impaired people.

Source: Mutual Information Maximization for Effective Lip Reading

Most implemented papers

XFlow: Cross-modal Deep Neural Networks for Audiovisual Classification

catalina17/XFlow 2 Sep 2017

Our work improves on existing multimodal deep learning algorithms in two essential ways: (1) it presents a novel method for performing cross-modality (before features are learned from individual modalities) and (2) it extends the previously proposed cross-connections, which only transfer information between streams that process compatible data.

Lip2AudSpec: Speech reconstruction from silent lip movements video

hassanhub/LipReading 26 Oct 2017

In this study, we propose a deep neural network for reconstructing intelligible speech from silent lip movement videos.

End-to-End Speech-Driven Facial Animation with Temporal GANs

PrashanthaTP/wav2mov 23 May 2018

To the best of our knowledge, this is the first method capable of generating subject-independent realistic videos directly from raw audio.

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

Hangz-nju-cuhk/Talking-Face-Generation-DAVS 20 Jul 2018

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

zju-vipa/KamalEngine 26 Nov 2019

In this paper, we propose a new method, termed Lip by Speech (LIBS), whose goal is to strengthen lip reading by learning from speech recognizers.

Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition

sailordiary/deep-face-vsr 6 Mar 2020

Recent advances in deep learning have heightened interest among researchers in the field of visual speech recognition (VSR).

Deformation Flow Based Two-Stream Network for Lip Reading

jingyunx/Deformation-Flow-Based-Two-stream-Network 12 Mar 2020

Observing the continuity between adjacent frames during speech and the consistency of motion patterns among different speakers when they pronounce the same phoneme, we model lip movements during speech as a sequence of apparent deformations in the lip region.

Mutual Information Maximization for Effective Lip Reading

xing96/MIM-lipreading 13 Mar 2020

By combining these two advantages, the proposed method is expected to be both discriminative and robust for effective lip reading.

Synchronous Bidirectional Learning for Multilingual Lip Reading

luomingshuang/SBL_For_Multilingual_Lip_Reading 8 May 2020

Based on this idea, we explore synergized learning of multilingual lip reading in this paper and propose a synchronous bidirectional learning (SBL) framework for effective synergy of multilingual lip reading.

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

Rudrabha/Lip2Wav CVPR 2020

In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.