no code implementations • ICON 2020 • Binu Jasim, Vinay Namboodiri, C V Jawahar
Back-translation augments parallel data by translating monolingual target-side sentences into the source language.
no code implementations • 2 Mar 2024 • Sindhu Hegde, Rudrabha Mukhopadhyay, C. V. Jawahar, Vinay Namboodiri
In this paper, we introduce a novel approach to address the task of synthesizing speech from silent videos of any in-the-wild speaker solely based on lip movements.
no code implementations • 11 Jan 2024 • Jack Saunders, Vinay Namboodiri
Our prior learning and adaptation method generalises to limited data better and is more scalable than existing person-specific models.
no code implementations • 18 Jul 2023 • Jack Saunders, Steven Caulkin, Vinay Namboodiri
The ability to accurately capture and express emotions is a critical aspect of creating believable characters in video games and other forms of entertainment.
no code implementations • 1 Mar 2023 • Jack Saunders, Vinay Namboodiri
We present READ Avatars, a 3D-based approach for generating 2D avatars that are driven by audio input with direct and granular control over the emotion.
1 code implementation • 6 Oct 2022 • Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar
The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.
1 code implementation • 21 Aug 2022 • Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
To tackle this challenge, we introduce video-to-video (V2V) face-swapping, a novel task of face-swapping that can preserve (1) the identity and expressions of the source (actor) face video and (2) the background and pose of the target (double) video.
no code implementations • 21 Aug 2022 • Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
Because of the manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers and have a high usage cost.
no code implementations • 2 Nov 2021 • Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar
Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment relying extensively on lip movements to communicate.
1 code implementation • 24 Jun 2021 • Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar
Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.
no code implementations • 12 Jun 2021 • Mohammed Asad Karim, Vinay Kumar Verma, Pravendra Singh, Vinay Namboodiri, Piyush Rai
In our approach, we learn robust representations that are generalizable across tasks without suffering from the problems of catastrophic forgetting and overfitting to accommodate future classes with limited samples.
1 code implementation • 20 Dec 2020 • Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
In this work, we re-think the task of speech enhancement in unconstrained real-world environments.
Ranked #1 on Speech Denoising on LRS3+VGGSound
no code implementations • NeurIPS 2020 • Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri
Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive.
4 code implementations • 23 Aug 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.
Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)
no code implementations • 18 Jun 2020 • Arnab Ghosh, Harkirat Singh Behl, Emilien Dupont, Philip H. S. Torr, Vinay Namboodiri
Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive.
1 code implementation • CVPR 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar
In this work, we explore the task of lip-to-speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.
Ranked #1 on Lip Reading on LRW
5 code implementations • 21 Apr 2020 • Arpan Mangal, Surya Kalia, Harish Rajgopal, Krithika Rangarajan, Vinay Namboodiri, Subhashis Banerjee, Chetan Arora
This may be useful in an inpatient setting where the present systems are struggling to decide whether to keep the patient in the ward along with other patients or isolate them in COVID-19 areas.
1 code implementation • ACM Multimedia 2019 • Prajwal K R, Rudrabha Mukhopadhyay, Jerin Philip, Abhishek Jha, Vinay Namboodiri, C. V. Jawahar
As today's digital communication becomes increasingly visual, we argue that there is a need for systems that can automatically translate a video of a person speaking in language A into a target language B with realistic lip synchronization.
Ranked #1 on Talking Face Generation on LRW (using extra training data)
no code implementations • WS 2019 • Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C. V. Jawahar
This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks that are part of WAT-2019.
no code implementations • 21 Oct 2019 • Pratik Mazumder, Pravendra Singh, Vinay Namboodiri
We propose an alternative design for pointwise convolution, which uses spatial information from the input efficiently.
no code implementations • EMNLP 2018 • Badri Narayana Patro, Sandeep Kumar, Vinod Kumar Kurmi, Vinay Namboodiri
Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations.
1 code implementation • COLING 2018 • Badri Narayana Patro, Vinod Kumar Kurmi, Sandeep Kumar, Vinay Namboodiri
One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far.
1 code implementation • CVPR 2018 • Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania
Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is designed such that along with finding the real and fake samples, it is also required to identify the generator that generated the given fake sample.
no code implementations • 5 Dec 2016 • Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri
As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs).
no code implementations • 29 Sep 2016 • Arnab Ghosh, Viveka Kulharia, Amitabha Mukerjee, Vinay Namboodiri, Mohit Bansal
Understanding, predicting, and generating object motions and transformations is a core problem in artificial intelligence.