Search Results for author: Vinay Namboodiri

Found 25 papers, 10 papers with code

Towards Accurate Lip-to-Speech Synthesis in-the-Wild

no code implementations2 Mar 2024 Sindhu Hegde, Rudrabha Mukhopadhyay, C. V. Jawahar, Vinay Namboodiri

In this paper, we introduce a novel approach to address the task of synthesizing speech from silent videos of any in-the-wild speaker solely based on lip movements.

Language Modelling Lip to Speech Synthesis +1

Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors

no code implementations11 Jan 2024 Jack Saunders, Vinay Namboodiri

Our prior learning and adaptation method generalises to limited data better and is more scalable than existing person-specific models.

Neural Rendering

FACTS: Facial Animation Creation using the Transfer of Styles

no code implementations18 Jul 2023 Jack Saunders, Steven Caulkin, Vinay Namboodiri

The ability to accurately capture and express emotions is a critical aspect of creating believable characters in video games and other forms of entertainment.

READ Avatars: Realistic Emotion-controllable Audio Driven Avatars

no code implementations1 Mar 2023 Jack Saunders, Vinay Namboodiri

We present READ Avatars, a 3D-based approach for generating 2D avatars that are driven by audio input with direct and granular control over the emotion.

Audio-Visual Face Reenactment

1 code implementation6 Oct 2022 Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.

Face Reenactment
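The abstract above mentions warping motion features before feeding them to an identity-aware generator. The abstract gives no implementation details; as a rough illustration only, the sketch below shows nearest-neighbour feature warping with an integer-offset flow field (the function name, the offset convention, and the lack of bilinear interpolation are all simplifications of my own, not the paper's method):

```python
import numpy as np

def warp_features(feat, flow):
    # feat: (H, W, C) feature map; flow: (H, W, 2) per-pixel (dy, dx) offsets.
    # Each output pixel is sampled from the source location given by the flow,
    # clipped to stay inside the feature map.
    H, W, _ = feat.shape
    out = np.zeros_like(feat)
    for y in range(H):
        for x in range(W):
            sy = int(np.clip(y + flow[y, x, 0], 0, H - 1))
            sx = int(np.clip(x + flow[y, x, 1], 0, W - 1))
            out[y, x] = feat[sy, sx]
    return out
```

Real motion-transfer pipelines typically use differentiable bilinear sampling instead of the integer lookup shown here.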

FaceOff: A Video-to-Video Face Swapping System

1 code implementation21 Aug 2022 Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

To tackle this challenge, we introduce video-to-video (V2V) face-swapping, a novel task of face-swapping that can preserve (1) the identity and expressions of the source (actor) face video and (2) the background and pose of the target (double) video.

Face Swapping

Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale

no code implementations21 Aug 2022 Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

Because of the manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers, and have a high usage cost.

Lipreading Lip Reading

Personalized One-Shot Lipreading for an ALS Patient

no code implementations2 Nov 2021 Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment relying extensively on lip movements to communicate.

Domain Adaptation Lipreading

Towards Automatic Speech to Sign Language Generation

1 code implementation24 Jun 2021 Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar

Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.

Text Generation

Knowledge Consolidation based Class Incremental Online Learning with Limited Data

no code implementations12 Jun 2021 Mohammed Asad Karim, Vinay Kumar Verma, Pravendra Singh, Vinay Namboodiri, Piyush Rai

In our approach, we learn robust representations that generalize across tasks, avoiding catastrophic forgetting and overfitting while accommodating future classes with limited samples.

Class Incremental Learning Incremental Learning +1

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

4 code implementations23 Aug 2020 K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.

 Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)

MORPH Unconstrained Lip-synchronization
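The "lip sync expert" in this line of work is a pretrained audio-visual synchronization network whose score supervises the lip generator. As a hedged sketch only, the loss below penalizes low cosine similarity between precomputed audio and video embeddings; the embedding networks, the [-1, 1] to [0, 1] mapping, and the exact loss form are my own assumptions, not the paper's implementation:

```python
import numpy as np

def cosine_sim(a, v):
    # Cosine similarity between an audio embedding and a video embedding.
    return float(np.dot(a, v) / (np.linalg.norm(a) * np.linalg.norm(v) + 1e-8))

def sync_loss(audio_emb, video_emb):
    # BCE-style loss treating the pair as "in sync": map cosine similarity
    # from [-1, 1] to a probability in [0, 1], then take negative log.
    p = (cosine_sim(audio_emb, video_emb) + 1.0) / 2.0
    return -np.log(p + 1e-8)
```

In training, minimizing this loss pushes generated mouth frames toward embeddings that the sync network judges consistent with the driving audio.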

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

1 code implementation CVPR 2020 K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.

 Ranked #1 on Lip Reading on LRW

Lip Reading Speaker-Specific Lip to Speech Synthesis +1

CovidAID: COVID-19 Detection Using Chest X-Ray

5 code implementations21 Apr 2020 Arpan Mangal, Surya Kalia, Harish Rajgopal, Krithika Rangarajan, Vinay Namboodiri, Subhashis Banerjee, Chetan Arora

This may be useful in an inpatient setting where present systems struggle to decide whether to keep the patient in the ward with other patients or isolate them in COVID-19 areas.

Towards Automatic Face-to-Face Translation

1 code implementation ACM Multimedia 2019 Prajwal K R, Rudrabha Mukhopadhyay, Jerin Philip, Abhishek Jha, Vinay Namboodiri, C. V. Jawahar

As today's digital communication becomes increasingly visual, we argue that there is a need for systems that can automatically translate a video of a person speaking in language A into a target language B with realistic lip synchronization.

 Ranked #1 on Talking Face Generation on LRW (using extra training data)

Face to Face Translation Machine Translation +3

CVIT's submissions to WAT-2019

no code implementations WS 2019 Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C. V. Jawahar

This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks part of WAT-2019.

Machine Translation Translation

CPWC: Contextual Point Wise Convolution for Object Recognition

no code implementations21 Oct 2019 Pratik Mazumder, Pravendra Singh, Vinay Namboodiri

We propose an alternative design for pointwise convolution, which uses spatial information from the input efficiently.

Object Object Recognition
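For context on the entry above: a plain pointwise (1x1) convolution mixes channels independently at each pixel and has no spatial receptive field of its own, which is the limitation a contextual variant would address. The sketch below shows only this standard baseline, not the paper's proposed design:

```python
import numpy as np

def pointwise_conv(x, w):
    # x: (H, W, C_in) feature map; w: (C_in, C_out) weight matrix.
    # A 1x1 convolution is a per-pixel linear map across channels,
    # so it reduces to a matrix multiply on the channel axis.
    return x @ w
```

Because each output pixel depends only on the same input pixel, any spatial context must come from elsewhere in the network, which is what motivates designs like CPWC.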

Multimodal Differential Network for Visual Question Generation

no code implementations EMNLP 2018 Badri Narayana Patro, Sandeep Kumar, Vinod Kumar Kurmi, Vinay Namboodiri

Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations.

Image Captioning Natural Questions +4

Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator

1 code implementation COLING 2018 Badri Narayana Patro, Vinod Kumar Kurmi, Sandeep Kumar, Vinay Namboodiri

One way to ensure this is to constrain embeddings of true paraphrases to be close and embeddings of unrelated candidate sentences to be far apart.

Machine Reading Comprehension Machine Translation +5
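One common way to encode the close/far constraint described above is a triplet-style margin loss over sentence embeddings. The sketch below is an illustrative assumption of my own, not the paper's sequential pair-wise discriminator:

```python
import numpy as np

def pairwise_margin_loss(anchor, positive, negative, margin=0.5):
    # Pull the true paraphrase embedding (positive) close to the anchor
    # and push the unrelated candidate (negative) at least `margin` farther.
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)
```

The loss is zero once the negative is more than `margin` farther from the anchor than the positive, so well-separated triplets stop contributing gradient.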

Multi-Agent Diverse Generative Adversarial Networks

1 code implementation CVPR 2018 Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania

Second, to enforce that different generators capture diverse high-probability modes, the discriminator of MAD-GAN is designed so that, in addition to distinguishing real from fake samples, it must also identify which generator produced a given fake sample.

Face Generation Image-to-Image Translation +1
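A discriminator that both spots fakes and attributes each fake to its generator, as described above, can be framed as a (K+1)-way classifier: one class per generator plus one class for real data. A minimal sketch of how the target labels could be laid out (the names and batch layout are my own assumptions, not the paper's code):

```python
import numpy as np

NUM_GENERATORS = 3  # hypothetical number of generators in the ensemble

def discriminator_targets(batch_real, fakes_per_generator):
    # Classes 0..K-1: "fake, produced by generator k"; class K: "real".
    # The discriminator is then trained with ordinary (K+1)-way cross-entropy.
    targets = []
    for k in range(NUM_GENERATORS):
        targets += [k] * fakes_per_generator
    targets += [NUM_GENERATORS] * batch_real
    return np.array(targets)
```

Attributing fakes to specific generators penalizes generators that collapse onto the same mode, since identical outputs make the attribution task easy for the discriminator.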

Message Passing Multi-Agent GANs

no code implementations5 Dec 2016 Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri

As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs).

Image Generation

Contextual RNN-GANs for Abstract Reasoning Diagram Generation

no code implementations29 Sep 2016 Arnab Ghosh, Viveka Kulharia, Amitabha Mukerjee, Vinay Namboodiri, Mohit Bansal

Understanding, predicting, and generating object motions and transformations is a core problem in artificial intelligence.

Generative Adversarial Network Video Generation
