Search Results for author: Rudrabha Mukhopadhyay

Found 15 papers, 9 papers with code

Towards Accurate Lip-to-Speech Synthesis in-the-Wild

no code implementations 2 Mar 2024 Sindhu Hegde, Rudrabha Mukhopadhyay, C. V. Jawahar, Vinay Namboodiri

In this paper, we introduce a novel approach to address the task of synthesizing speech from silent videos of any in-the-wild speaker solely based on lip movements.

Language Modelling, Lip to Speech Synthesis

Compressing Video Calls using Synthetic Talking Heads

1 code implementation 7 Oct 2022 Madhav Agarwal, Anchit Gupta, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C V Jawahar

We use a state-of-the-art face reenactment network to detect key points in the non-pivot frames and transmit them to the receiver.

Face Reenactment, Talking Head Generation

Audio-Visual Face Reenactment

1 code implementation 6 Oct 2022 Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.

Face Reenactment

Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild

no code implementations 1 Sep 2022 Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar

With the help of multiple powerful discriminators that guide the training process, our generator learns to synthesize speech sequences in any voice for the lip movements of any person.

Lip to Speech Synthesis, Speech Synthesis

Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale

no code implementations 21 Aug 2022 Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V Jawahar

Because of the manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers and have a high usage cost.

Lipreading, Lip Reading

FaceOff: A Video-to-Video Face Swapping System

1 code implementation 21 Aug 2022 Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

To tackle this challenge, we introduce video-to-video (V2V) face-swapping, a novel task of face-swapping that can preserve (1) the identity and expressions of the source (actor) face video and (2) the background and pose of the target (double) video.

Face Swapping

Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors

1 code implementation 17 Aug 2022 Sindhu B Hegde, Rudrabha Mukhopadhyay, Vinay P Namboodiri, C. V. Jawahar

We show that when we process this $8\times8$ video with the right set of audio and image priors, we can obtain a full-length, $256\times256$ video.

Super-Resolution, Video Compression

Personalized One-Shot Lipreading for an ALS Patient

no code implementations 2 Nov 2021 Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment relying extensively on lip movements to communicate.

Domain Adaptation, Lipreading

Towards Automatic Speech to Sign Language Generation

1 code implementation 24 Jun 2021 Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar

Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.

Text Generation

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

4 code implementations 23 Aug 2020 K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.

 Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)

MORPH, Unconstrained Lip-synchronization

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

1 code implementation CVPR 2020 K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.

 Ranked #1 on Lip Reading on LRW

Lip Reading, Speaker-Specific Lip to Speech Synthesis

IndicSpeech: Text-to-Speech Corpus for Indian Languages

no code implementations LREC 2020 Nimisha Srivastava, Rudrabha Mukhopadhyay, Prajwal K R, C. V. Jawahar

We believe that one of the major reasons for this is the lack of large, publicly available text-to-speech corpora in these languages that are suitable for training neural text-to-speech systems.

Towards Automatic Face-to-Face Translation

1 code implementation ACM Multimedia 2019 Prajwal K R, Rudrabha Mukhopadhyay, Jerin Philip, Abhishek Jha, Vinay Namboodiri, C. V. Jawahar

As today's digital communication becomes increasingly visual, we argue that there is a need for systems that can automatically translate a video of a person speaking in language A into a target language B with realistic lip synchronization.

 Ranked #1 on Talking Face Generation on LRW (using extra training data)

Face to Face Translation, Machine Translation
