Search Results for author: Apoorv Vyas

Found 11 papers, 6 papers with code

Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning

no code implementations • 10 Jun 2024 • Chung-Ming Chien, Andros Tjandra, Apoorv Vyas, Matt Le, Bowen Shi, Wei-Ning Hsu

As the scale of generative models continues to grow, efficient reuse and adaptation of pre-trained models have become crucial considerations.

Audiobox: Unified Audio Generation with Natural Language Prompts

no code implementations • 25 Dec 2023 • Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Research communities have made great progress over the past year advancing the performance of large scale audio generative models for a single modality (speech, sound, or music) through adopting more powerful generative models and scaling data.

AudioCaps, Audio Generation

Generative Pre-training for Speech with Flow Matching

no code implementations • 25 Oct 2023 • Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Generative models have gained more and more attention in recent years for their remarkable success in tasks that required estimating and sampling data distribution to generate high-fidelity synthetic data.

Speech Enhancement, Speech Synthesis

On-demand compute reduction with stochastic wav2vec 2.0

no code implementations • 25 Apr 2022 • Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski

Our results for models pre-trained on 960h Librispeech dataset and fine-tuned on 10h of transcribed data show that using the same stochastic model, we get a smooth trade-off between word error rate (WER) and inference time with only marginal WER degradation compared to the W2V2 and SEW models trained for a specific setting.

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

2 code implementations • 28 Dec 2020 • Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models.

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models

1 code implementation • 7 Oct 2020 • Srikanth Madikeri, Sibo Tong, Juan Zuluaga-Gomez, Apoorv Vyas, Petr Motlicek, Hervé Bourlard

We present a simple wrapper that is useful to train acoustic models in PyTorch using Kaldi's LF-MMI training framework.

Audio and Speech Processing, Sound

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

7 code implementations • ICML 2020 • Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret

Transformers achieve remarkable performance in several tasks but due to their quadratic complexity, with respect to the input's length, they are prohibitively slow for very long sequences.

D4RL, Language Modelling
