Search Results for author: Apoorv Vyas

Found 10 papers, 6 papers with code

Audiobox: Unified Audio Generation with Natural Language Prompts

no code implementations25 Dec 2023 Apoorv Vyas, Bowen Shi, Matthew Le, Andros Tjandra, Yi-Chiao Wu, Baishan Guo, Jiemin Zhang, Xinyue Zhang, Robert Adkins, William Ngan, Jeff Wang, Ivan Cruz, Bapi Akula, Akinniyi Akinyemi, Brian Ellis, Rashel Moritz, Yael Yungster, Alice Rakotoarison, Liang Tan, Chris Summers, Carleigh Wood, Joshua Lane, Mary Williamson, Wei-Ning Hsu

Research communities have made great progress over the past year in advancing the performance of large-scale audio generative models for a single modality (speech, sound, or music) by adopting more powerful generative models and scaling data.

AudioCaps Audio Generation +1

Generative Pre-training for Speech with Flow Matching

no code implementations25 Oct 2023 Alexander H. Liu, Matt Le, Apoorv Vyas, Bowen Shi, Andros Tjandra, Wei-Ning Hsu

Generative models have gained increasing attention in recent years for their remarkable success in tasks that require estimating and sampling a data distribution to generate high-fidelity synthetic data.
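The flow-matching objective behind this line of work can be sketched in a few lines. The version below uses the linear (optimal-transport) interpolation path and a plain regression loss; it is a generic illustration, not the paper's exact formulation, and the names (`flow_matching_targets`, `fm_loss`) are made up for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def flow_matching_targets(x0, x1, t):
    """Linear (optimal-transport) path: x_t = (1 - t) x0 + t x1,
    with target velocity v = x1 - x0."""
    xt = (1.0 - t)[:, None] * x0 + t[:, None] * x1
    v = x1 - x0
    return xt, v

def fm_loss(predict_v, x0, x1, t):
    """Regress the model's velocity field onto the target:
    E || v_theta(x_t, t) - (x1 - x0) ||^2."""
    xt, v = flow_matching_targets(x0, x1, t)
    return np.mean((predict_v(xt, t) - v) ** 2)

# Toy check: an oracle that outputs x1 - x0 achieves zero loss.
x0 = rng.normal(size=(8, 4))  # noise samples
x1 = rng.normal(size=(8, 4))  # data samples
t = rng.uniform(size=8)       # random timesteps in [0, 1]
oracle = lambda xt, t: x1 - x0
print(fm_loss(oracle, x0, x1, t))  # 0.0
```

At inference, a trained velocity field is integrated from noise to data with an ODE solver, which is what makes sampling fast relative to many-step diffusion.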

Speech Enhancement Speech Synthesis +1

On-demand compute reduction with stochastic wav2vec 2.0

no code implementations25 Apr 2022 Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski

Our results for models pre-trained on the 960h Librispeech dataset and fine-tuned on 10h of transcribed data show that, using the same stochastic model, we obtain a smooth trade-off between word error rate (WER) and inference time, with only marginal WER degradation compared to W2V2 and SEW models trained for a specific setting.
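The idea of one model serving many compute budgets can be illustrated with stochastic depth: sample how much of the encoder to run during training, then pick the depth at inference to match the budget. This is a generic sketch of that training trick, not the paper's exact compression mechanism, and `stochastic_forward` is a hypothetical helper name.

```python
import random

def stochastic_forward(x, layers, keep_k=None, rng=random):
    """Run only the first `keep_k` encoder layers. During training,
    keep_k is sampled so every prefix depth gets trained; at inference,
    the caller fixes keep_k to hit a compute budget."""
    if keep_k is None:
        keep_k = rng.randint(1, len(layers))  # sampled depth (training)
    for layer in layers[:keep_k]:
        x = layer(x)
    return x

# Toy layers: each adds 1, so the output equals the depth actually used.
layers = [lambda x: x + 1 for _ in range(12)]
print(stochastic_forward(0, layers, keep_k=4))   # 4  (cheap setting)
print(stochastic_forward(0, layers, keep_k=12))  # 12 (full model)
```

Because every prefix depth is exercised during training, a single checkpoint supports the whole WER/inference-time trade-off curve instead of training one model per setting.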

Lattice-Free MMI Adaptation Of Self-Supervised Pretrained Acoustic Models

2 code implementations28 Dec 2020 Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard

In this work, we propose lattice-free MMI (LFMMI) for supervised adaptation of self-supervised pretrained acoustic models.

Pkwrap: a PyTorch Package for LF-MMI Training of Acoustic Models

1 code implementation7 Oct 2020 Srikanth Madikeri, Sibo Tong, Juan Zuluaga-Gomez, Apoorv Vyas, Petr Motlicek, Hervé Bourlard

We present a simple wrapper that is useful to train acoustic models in PyTorch using Kaldi's LF-MMI training framework.

Audio and Speech Processing Sound

Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention

6 code implementations ICML 2020 Angelos Katharopoulos, Apoorv Vyas, Nikolaos Pappas, François Fleuret

Transformers achieve remarkable performance in several tasks, but due to their quadratic complexity with respect to input length, they are prohibitively slow for very long sequences.

D4RL Language Modelling +1
