Search Results for author: Ambuj Mehrish

Found 7 papers, 6 papers with code

HyperTTS: Parameter Efficient Adaptation in Text to Speech using Hypernetworks

1 code implementation • 6 Apr 2024 • Yingting Li, Rishabh Bhardwaj, Ambuj Mehrish, Bo Cheng, Soujanya Poria

In this work, we present HyperTTS, which comprises a small learnable network, "hypernetwork", that generates parameters of the Adapter blocks, allowing us to condition Adapters on speaker representations and making them dynamic.

Domain Adaptation, Speech Synthesis

CM-TTS: Enhancing Real Time Text-to-Speech Synthesis Efficiency through Weighted Samplers and Consistency Models

1 code implementation • 31 Mar 2024 • Xiang Li, Fan Bu, Ambuj Mehrish, Yingting Li, Jiale Han, Bo Cheng, Soujanya Poria

The pursuit of modern models, like Diffusion Models (DMs), holds promise for achieving high-fidelity, real-time speech synthesis.

Denoising, Speech Synthesis, +1

ADAPTERMIX: Exploring the Efficacy of Mixture of Adapters for Low-Resource TTS Adaptation

1 code implementation • 29 May 2023 • Ambuj Mehrish, Abhinav Ramesh Kashyap, Li Yingting, Navonil Majumder, Soujanya Poria

There are significant challenges for speaker adaptation in text-to-speech for languages that are not widely spoken or for speakers with accents or dialects that are not well-represented in the training data.

Speech Synthesis

A Review of Deep Learning Techniques for Speech Processing

no code implementations • 30 Apr 2023 • Ambuj Mehrish, Navonil Majumder, Rishabh Bhardwaj, Rada Mihalcea, Soujanya Poria

The power of deep learning techniques has opened up new avenues for research and innovation in the field of speech processing, with far-reaching implications for a range of industries and applications.

Automatic Speech Recognition, Emotion Recognition, +4

Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model

1 code implementation • 24 Apr 2023 • Deepanway Ghosal, Navonil Majumder, Ambuj Mehrish, Soujanya Poria

The immense scale of recent large language models (LLMs) enables many interesting properties, such as instruction- and chain-of-thought-based fine-tuning, that have significantly improved zero- and few-shot performance in many natural language processing (NLP) tasks.

Ranked #4 on Audio Generation on AudioCaps

AudioCaps, Audio Generation

Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding

1 code implementation • 2 Mar 2023 • Yingting Li, Ambuj Mehrish, Shuai Zhao, Rishabh Bhardwaj, Amir Zadeh, Navonil Majumder, Rada Mihalcea, Soujanya Poria

To mitigate this issue, parameter-efficient transfer learning algorithms, such as adapters and prefix tuning, have been proposed as a way to introduce a few trainable parameters that can be plugged into large pre-trained language models such as BERT and HuBERT.

Speech Synthesis, Transfer Learning

Accented Text-to-Speech Synthesis with a Conditional Variational Autoencoder

1 code implementation • 7 Nov 2022 • Jan Melechovsky, Ambuj Mehrish, Berrak Sisman, Dorien Herremans

Accent plays a significant role in speech communication, influencing understanding capabilities and also conveying a person's identity.

Speech Synthesis, Text-To-Speech Synthesis
