Search Results for author: K R Prajwal

Found 9 papers, 5 with code

A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision

no code implementations · 16 May 2024 · Charles Raude, K R Prajwal, Liliane Momeni, Hannah Bull, Samuel Albanie, Andrew Zisserman, Gül Varol

To this end, we introduce CSLR2, a multi-task Transformer model that ingests a signing sequence and outputs embeddings in a joint space shared by signed language and spoken language text.

Tasks: Retrieval, Sign Language Recognition, +1
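A joint embedding space like the one CSLR2 learns is typically trained with a symmetric contrastive objective over paired examples. The sketch below is a generic InfoNCE loss in NumPy, not the authors' implementation; the `temperature` value and embedding shapes are illustrative assumptions. It shows how paired signing-sequence and text embeddings can be pulled together while mismatched pairs in the batch are pushed apart:

```python
import numpy as np

def l2_normalize(x, axis=-1):
    return x / np.linalg.norm(x, axis=axis, keepdims=True)

def infonce_loss(video_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    video_emb, text_emb: (B, D) arrays where row i of each is a matching pair.
    """
    v = l2_normalize(video_emb)
    t = l2_normalize(text_emb)
    logits = v @ t.T / temperature            # (B, B) similarity matrix
    labels = np.arange(len(logits))           # matching pairs lie on the diagonal

    def xent(lg):
        # numerically stable log-softmax cross-entropy against the diagonal
        lg = lg - lg.max(axis=1, keepdims=True)
        logp = lg - np.log(np.exp(lg).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average the video->text and text->video directions
    return 0.5 * (xent(logits) + xent(logits.T))
```

With this objective, retrieval in either direction (sign video to text, or text to sign video) reduces to a nearest-neighbour search in the shared space.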

Weakly-supervised Fingerspelling Recognition in British Sign Language Videos

1 code implementation · 16 Nov 2022 · K R Prajwal, Hannah Bull, Liliane Momeni, Samuel Albanie, Gül Varol, Andrew Zisserman

Through extensive evaluations, we validate both our automatic annotation method and our model architecture.

Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild

no code implementations · 1 Sep 2022 · Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar

With the help of multiple powerful discriminators that guide the training process, our generator learns to synthesize speech sequences in any voice for the lip movements of any person.

Tasks: Lip to Speech Synthesis, Speech Synthesis

Automatic dense annotation of large-vocabulary sign language videos

no code implementations · 4 Aug 2022 · Liliane Momeni, Hannah Bull, K R Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman

Recently, sign language researchers have turned to sign language interpreted TV broadcasts, comprising (i) a video of continuous signing and (ii) subtitles corresponding to the audio content, as a readily available and large-scale source of training data.

Visual Keyword Spotting with Attention

1 code implementation · 29 Oct 2021 · K R Prajwal, Liliane Momeni, Triantafyllos Afouras, Andrew Zisserman

In this paper, we consider the task of spotting spoken keywords in silent video sequences -- also known as visual keyword spotting.

Tasks: Lip Reading, Visual Keyword Spotting

Sub-word Level Lip Reading With Visual Attention

no code implementations · CVPR 2022 · K R Prajwal, Triantafyllos Afouras, Andrew Zisserman

To this end, we make the following contributions: (1) we propose an attention-based pooling mechanism to aggregate visual speech representations; (2) we use sub-word units for lip reading for the first time and show that this allows us to better model the ambiguities of the task; (3) we propose a model for Visual Speech Detection (VSD), trained on top of the lip reading network.

Ranked #1 on Visual Speech Recognition on LRS2 (using extra training data)

Tasks: Audio-Visual Active Speaker Detection, Automatic Speech Recognition, +5
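Contribution (1) above, attention-based pooling, can be sketched in a few lines: instead of averaging per-frame features uniformly, the model learns to weight the informative frames. The NumPy code below is a hedged illustration, not the paper's architecture; the scoring vector `w` is a hypothetical stand-in for whatever learned query the model actually uses:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(frames, w):
    """Collapse a (T, D) sequence of per-frame features into one (D,) vector.

    frames: (T, D) visual speech representations, one row per frame.
    w:      (D,) scoring vector (illustrative stand-in for a learned query).
    """
    scores = frames @ w       # (T,) one relevance score per frame
    alpha = softmax(scores)   # attention weights, non-negative and summing to 1
    return alpha @ frames     # attention-weighted average of the frames
```

With `w` at zero the pooling degenerates to a plain mean over frames; training moves `w` so that frames carrying visible speech cues dominate the pooled representation.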

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

4 code implementations · 23 Aug 2020 · K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.

Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)

Tasks: MORPH, Unconstrained Lip-synchronization

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

1 code implementation · CVPR 2020 · K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.

Ranked #1 on Lip Reading on LRW

Tasks: Lip Reading, Speaker-Specific Lip to Speech Synthesis, +1
