Lip Reading

45 papers with code • 3 benchmarks • 5 datasets

Lip reading is the task of inferring the speech content of a video using only visual information, especially lip movements. It has many practical applications, such as assisting audio-based speech recognition, biometric authentication, and aiding hearing-impaired people.

Source: Mutual Information Maximization for Effective Lip Reading

Benchmarks

These leaderboards track progress in lip reading.

Dataset	Best Model
GRID corpus (mixed-speech)	Lip2Wav
TCD-TIMIT corpus (mixed-speech)	Lip2Wav
LRW	Lip2Wav

Subtasks

Lip password classification

Latest papers with no code

Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder

no code yet • 8 Apr 2024

Automatic lip-reading (ALR) aims to automatically transcribe spoken content from a speaker's silent lip motion captured in video.

Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization

no code yet • 24 Mar 2024

Lip reading, the process of interpreting silent speech from visual lip movements, has gained rising attention for its wide range of realistic applications.

Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading

no code yet • 18 Feb 2024

Lipreading involves using visual data to recognize spoken words by analyzing the movements of the lips and surrounding area.

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition

no code yet • 31 Jan 2024

Cued Speech (CS) is a pure visual coding method used by hearing-impaired people that combines lip reading with several specific hand shapes to make the spoken language visible.

Exploring Lip Segmentation Techniques in Computer Vision: A Comparative Analysis

no code yet • 20 Nov 2023

The findings contribute to the development of lightweight techniques and establish benchmarks for future advances in lip segmentation, especially in IoT and edge computing scenarios.

DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

no code yet • 8 Nov 2023

In recent years, audio-driven 3D facial animation has gained significant attention, particularly in applications such as virtual reality, gaming, and video conferencing.

End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition

no code yet • 7 Oct 2023

Lip reading or visual speech recognition has gained significant attention in recent years, particularly because of hardware development and innovations in computer vision.

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

no code yet • ICCV 2023

In order to mitigate the challenge, we try to learn general speech knowledge, the ability to model lip movements, from a high-resource language through the prediction of speech units.

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

no code yet • ICCV 2023

Visual Speech Recognition (VSR) differs from the common perception tasks as it requires deeper reasoning over the video sequence, even by human experts.

Leveraging Visemes for Better Visual Speech Representation and Lip Reading

no code yet • 19 Jul 2023

We evaluate our approach on various tasks, including word-level and sentence-level lip reading, and audiovisual speech recognition using the Arman-AV dataset, a large-scale Persian corpus.
