Search Results for author: Ravi Teja Gadde

Found 7 papers, 2 papers with code

Diff2Lip: Audio Conditioned Diffusion Models for Lip-Synchronization

1 code implementation 18 Aug 2023 Soumik Mukhopadhyay, Saksham Suri, Ravi Teja Gadde, Abhinav Shrivastava

We show results in both reconstruction (same audio-video inputs) and cross (different audio-video inputs) settings on the VoxCeleb2 and LRW datasets.

SIDGAN: High-Resolution Dubbed Video Generation via Shift-Invariant Learning

no code implementations ICCV 2023 Urwa Muaz, WonDong Jang, Rohun Tripathi, Santhosh Mani, Wenbin Ouyang, Ravi Teja Gadde, Baris Gecer, Sergio Elizondo, Reza Madad, Naveen Nair

Dubbed video generation aims to accurately synchronize mouth movements of a given facial video with driving audio while preserving identity and scene-specific visual dynamics, such as head pose and lighting.

Image Generation, Video Generation, +1 more

Prompt Tuning GPT-2 language model for parameter-efficient domain adaptation of ASR systems

no code implementations 16 Dec 2021 Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Automatic Speech Recognition (ASR) systems have found use in numerous industrial applications in very diverse domains, creating a need to adapt to new domains with small memory and deployment overhead.

Automatic Speech Recognition (ASR), +3 more

Prompt-tuning in ASR systems for efficient domain-adaptation

no code implementations 13 Oct 2021 Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

In this work, we overcome the problem using prompt-tuning, a methodology that trains a small number of domain token embedding parameters to prime a transformer-based LM to a particular domain.

Automatic Speech Recognition (ASR), +2 more

Towards Continual Entity Learning in Language Models for Conversational Agents

no code implementations 30 Jul 2021 Ravi Teja Gadde, Ivan Bulyko

Neural language models (LMs) trained on diverse corpora are known to work well on previously seen entities; however, updating these models with dynamically changing entities such as place names, song titles, and shopping items requires re-training from scratch and collecting full sentences containing these entities.

Language Modelling, Sentence

Neural Composition: Learning to Generate from Multiple Models

no code implementations 10 Jul 2020 Denis Filimonov, Ravi Teja Gadde, Ariya Rastrow

Decomposing models into multiple components is critically important in many applications such as language modeling (LM) as it enables adapting individual components separately and biasing of some components to the user's personal preferences.

Language Modelling

Jasper: An End-to-End Convolutional Neural Acoustic Model

10 code implementations 5 Apr 2019 Jason Li, Vitaly Lavrukhin, Boris Ginsburg, Ryan Leary, Oleksii Kuchaiev, Jonathan M. Cohen, Huyen Nguyen, Ravi Teja Gadde

In this paper, we report state-of-the-art results on LibriSpeech among end-to-end speech recognition models without any external training data.

Language Modelling, Speech Recognition
