Search Results for author: Vinay Namboodiri

Found 25 papers, 10 papers with code

PhraseOut: A Code Mixed Data Augmentation Method for Multilingual Neural Machine Translation

no code implementations • ICON 2020 • Binu Jasim, Vinay Namboodiri, C V Jawahar

Back-translation augments parallel data by translating monolingual sentences in the target side to source language.

Data Augmentation, Machine Translation +3

Towards Accurate Lip-to-Speech Synthesis in-the-Wild

no code implementations • 2 Mar 2024 • Sindhu Hegde, Rudrabha Mukhopadhyay, C. V. Jawahar, Vinay Namboodiri

In this paper, we introduce a novel approach to address the task of synthesizing speech from silent videos of any in-the-wild speaker solely based on lip movements.

Language Modelling, Lip to Speech Synthesis +1

Dubbing for Everyone: Data-Efficient Visual Dubbing using Neural Rendering Priors

no code implementations • 11 Jan 2024 • Jack Saunders, Vinay Namboodiri

Our prior learning and adaptation method $\textbf{generalises to limited data}$ better and is more $\textbf{scalable}$ than existing person-specific models.

Neural Rendering

FACTS: Facial Animation Creation using the Transfer of Styles

no code implementations • 18 Jul 2023 • Jack Saunders, Steven Caulkin, Vinay Namboodiri

The ability to accurately capture and express emotions is a critical aspect of creating believable characters in video games and other forms of entertainment.

READ Avatars: Realistic Emotion-controllable Audio Driven Avatars

no code implementations • 1 Mar 2023 • Jack Saunders, Vinay Namboodiri

We present READ Avatars, a 3D-based approach for generating 2D avatars that are driven by audio input with direct and granular control over the emotion.

Audio-Visual Face Reenactment

1 code implementation • 6 Oct 2022 • Madhav Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

The identity-aware generator takes the source image and the warped motion features as input to generate a high-quality output with fine-grained details.

Face Reenactment

154

FaceOff: A Video-to-Video Face Swapping System

1 code implementation • 21 Aug 2022 • Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

To tackle this challenge, we introduce video-to-video (V2V) face-swapping, a novel task of face-swapping that can preserve (1) the identity and expressions of the source (actor) face video and (2) the background and pose of the target (double) video.

Face Swapping

Towards MOOCs for Lipreading: Using Synthetic Talking Heads to Train Humans in Lipreading at Scale

no code implementations • 21 Aug 2022 • Aditya Agarwal, Bipasha Sen, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V Jawahar

Because of the manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers and have a high usage cost.

Lipreading, Lip Reading

Personalized One-Shot Lipreading for an ALS Patient

no code implementations • 2 Nov 2021 • Bipasha Sen, Aditya Agarwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

Apart from evaluating our approach on the ALS patient, we also extend it to people with hearing impairment relying extensively on lip movements to communicate.

Domain Adaptation, Lipreading

Towards Automatic Speech to Sign Language Generation

1 code implementation • 24 Jun 2021 • Parul Kapoor, Rudrabha Mukhopadhyay, Sindhu B Hegde, Vinay Namboodiri, C V Jawahar

Since the current datasets are inadequate for generating sign language directly from speech, we collect and release the first Indian sign language dataset comprising speech-level annotations, text transcripts, and the corresponding sign-language videos.

Text Generation

Knowledge Consolidation based Class Incremental Online Learning with Limited Data

no code implementations • 12 Jun 2021 • Mohammed Asad Karim, Vinay Kumar Verma, Pravendra Singh, Vinay Namboodiri, Piyush Rai

In our approach, we learn robust representations that are generalizable across tasks without suffering from the problems of catastrophic forgetting and overfitting to accommodate future classes with limited samples.

Class Incremental Learning, Incremental Learning +1

Visual Speech Enhancement Without A Real Visual Stream

1 code implementation • 20 Dec 2020 • Sindhu B Hegde, K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

In this work, we re-think the task of speech enhancement in unconstrained real-world environments.

Ranked #1 on Speech Denoising on LRS3+VGGSound

Denoising, Speech Denoising +1

STEER : Simple Temporal Regularization For Neural ODE

no code implementations • NeurIPS 2020 • Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri

Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive.

Time Series, Time Series Analysis

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

4 code implementations • 23 Aug 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

However, they fail to accurately morph the lip movements of arbitrary identities in dynamic, unconstrained talking face videos, resulting in significant parts of the video being out-of-sync with the new audio.

Ranked #1 on Unconstrained Lip-synchronization on LRS3 (using extra training data)

MORPH, Unconstrained Lip-synchronization

9,179

STEER: Simple Temporal Regularization For Neural ODEs

no code implementations • 18 Jun 2020 • Arnab Ghosh, Harkirat Singh Behl, Emilien Dupont, Philip H. S. Torr, Vinay Namboodiri

Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive.

Time Series, Time Series Analysis

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

1 code implementation • CVPR 2020 • K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C. V. Jawahar

In this work, we explore the task of lip to speech synthesis, i.e., learning to generate natural speech given only the lip movements of a speaker.

Ranked #1 on Lip Reading on LRW

Lip Reading, Speaker-Specific Lip to Speech Synthesis +1

683

CovidAID: COVID-19 Detection Using Chest X-Ray

5 code implementations • 21 Apr 2020 • Arpan Mangal, Surya Kalia, Harish Rajgopal, Krithika Rangarajan, Vinay Namboodiri, Subhashis Banerjee, Chetan Arora

This may be useful in an inpatient setting where the present systems are struggling to decide whether to keep the patient in the ward along with other patients or isolate them in COVID-19 areas.

Towards Automatic Face-to-Face Translation

1 code implementation • ACM Multimedia 2019 • Prajwal K R, Rudrabha Mukhopadhyay, Jerin Philip, Abhishek Jha, Vinay Namboodiri, C. V. Jawahar

As today's digital communication becomes increasingly visual, we argue that there is a need for systems that can automatically translate a video of a person speaking in language A into a target language B with realistic lip synchronization.

Ranked #1 on Talking Face Generation on LRW (using extra training data)

Face to Face Translation, Machine Translation +3

572

CVIT's submissions to WAT-2019

no code implementations • WS 2019 • Jerin Philip, Shashank Siripragada, Upendra Kumar, Vinay Namboodiri, C. V. Jawahar

This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks part of WAT-2019.

Machine Translation, Translation

CPWC: Contextual Point Wise Convolution for Object Recognition

no code implementations • 21 Oct 2019 • Pratik Mazumder, Pravendra Singh, Vinay Namboodiri

We propose an alternative design for pointwise convolution, which uses spatial information from the input efficiently.

Object, Object Recognition

Multimodal Differential Network for Visual Question Generation

no code implementations • EMNLP 2018 • Badri Narayana Patro, Sandeep Kumar, Vinod Kumar Kurmi, Vinay Namboodiri

Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations.

Image Captioning, Natural Questions +4

Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator

1 code implementation • COLING 2018 • Badri Narayana Patro, Vinod Kumar Kurmi, Sandeep Kumar, Vinay Namboodiri

One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far.

Machine Reading Comprehension, Machine Translation +5

Multi-Agent Diverse Generative Adversarial Networks

1 code implementation • CVPR 2018 • Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri, Philip H. S. Torr, Puneet K. Dokania

Second, to enforce that different generators capture diverse high probability modes, the discriminator of MAD-GAN is designed such that along with finding the real and fake samples, it is also required to identify the generator that generated the given fake sample.

Face Generation, Image-to-Image Translation +1

Message Passing Multi-Agent GANs

no code implementations • 5 Dec 2016 • Arnab Ghosh, Viveka Kulharia, Vinay Namboodiri

As a first step towards this challenge, we introduce a novel framework for image generation: Message Passing Multi-Agent Generative Adversarial Networks (MPM GANs).

Image Generation

Contextual RNN-GANs for Abstract Reasoning Diagram Generation

no code implementations • 29 Sep 2016 • Arnab Ghosh, Viveka Kulharia, Amitabha Mukerjee, Vinay Namboodiri, Mohit Bansal

Understanding, predicting, and generating object motions and transformations is a core problem in artificial intelligence.

Generative Adversarial Network, Video Generation
