Lip Reading

45 papers with code • 3 benchmarks • 5 datasets

Lip reading is the task of inferring the speech content of a video using only visual information, especially lip movements. It has many practical applications, such as assisting audio-based speech recognition, biometric authentication, and aiding hearing-impaired people.

Source: Mutual Information Maximization for Effective Lip Reading

Latest papers with no code

Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder

no code yet • 8 Apr 2024

Automatic lip-reading (ALR) aims to automatically transcribe spoken content from a speaker's silent lip motion captured in video.

Landmark-Guided Cross-Speaker Lip Reading with Mutual Information Regularization

no code yet • 24 Mar 2024

Lip reading, the process of interpreting silent speech from visual lip movements, has gained rising attention for its wide range of realistic applications.

Cross-Attention Fusion of Visual and Geometric Features for Large Vocabulary Arabic Lipreading

no code yet • 18 Feb 2024

Lipreading involves using visual data to recognize spoken words by analyzing the movements of the lips and surrounding area.

Computation and Parameter Efficient Multi-Modal Fusion Transformer for Cued Speech Recognition

no code yet • 31 Jan 2024

Cued Speech (CS) is a pure visual coding method used by hearing-impaired people that combines lip reading with several specific hand shapes to make the spoken language visible.

Exploring Lip Segmentation Techniques in Computer Vision: A Comparative Analysis

no code yet • 20 Nov 2023

The findings contribute to the development of lightweight techniques and establish benchmarks for future advances in lip segmentation, especially in IoT and edge computing scenarios.

DualTalker: A Cross-Modal Dual Learning Approach for Speech-Driven 3D Facial Animation

no code yet • 8 Nov 2023

In recent years, audio-driven 3D facial animation has gained significant attention, particularly in applications such as virtual reality, gaming, and video conferencing.

End-to-End Lip Reading in Romanian with Cross-Lingual Domain Adaptation and Lateral Inhibition

no code yet • 7 Oct 2023

Lip reading or visual speech recognition has gained significant attention in recent years, particularly because of hardware development and innovations in computer vision.

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

no code yet • ICCV 2023

In order to mitigate the challenge, we try to learn general speech knowledge, the ability to model lip movements, from a high-resource language through the prediction of speech units.

Lip2Vec: Efficient and Robust Visual Speech Recognition via Latent-to-Latent Visual to Audio Representation Mapping

no code yet • ICCV 2023

Visual Speech Recognition (VSR) differs from the common perception tasks as it requires deeper reasoning over the video sequence, even by human experts.

Leveraging Visemes for Better Visual Speech Representation and Lip Reading

no code yet • 19 Jul 2023

We evaluate our approach on various tasks, including word-level and sentence-level lip reading, and audiovisual speech recognition using the Arman-AV dataset, a large-scale Persian corpus.