no code implementations • 10 Oct 2024 • Ethan He, Abhinav Khattar, Ryan Prenger, Vijay Korthikanti, Zijie Yan, Tong Liu, Shiqing Fan, Ashwath Aithal, Mohammad Shoeybi, Bryan Catanzaro
Upcycling pre-trained dense language models into sparse mixture-of-experts (MoE) models is an efficient approach to increase the model capacity of already trained models.
3 code implementations • NeurIPS 2023 • Kaiyu Yang, Aidan M. Swope, Alex Gu, Rahul Chalamala, Peiyang Song, Shixing Yu, Saad Godil, Ryan Prenger, Anima Anandkumar
Using this data, we develop ReProver (Retrieval-Augmented Prover): an LLM-based prover augmented with retrieval for selecting premises from a vast math library.
no code implementations • 12 Oct 2022 • Dan Su, Mostofa Patwary, Shrimai Prabhumoye, Peng Xu, Ryan Prenger, Mohammad Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro
Prior work on closed-book QA either directly finetunes or prompts a pretrained language model (LM) to leverage the stored knowledge.
1 code implementation • Findings (ACL) 2022 • Zihan Liu, Mostofa Patwary, Ryan Prenger, Shrimai Prabhumoye, Wei Ping, Mohammad Shoeybi, Bryan Catanzaro
We propose a multi-stage prompting approach to generate knowledgeable responses from a single pretrained LM.
3 code implementations • ICLR 2021 • Rafael Valle, Kevin Shih, Ryan Prenger, Bryan Catanzaro
In this paper we propose Flowtron: an autoregressive flow-based generative network for text-to-speech synthesis with control over speech variation and style transfer.
Ranked #1 on
Text-To-Speech Synthesis
on LJSpeech
(Pleasantness MOS metric, using extra
training data)
4 code implementations • 26 Oct 2019 • Rafael Valle, Jason Li, Ryan Prenger, Bryan Catanzaro
Mellotron is a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data.
2 code implementations • 31 Oct 2018 • Ryan Prenger, Rafael Valle, Bryan Catanzaro
In this paper we propose WaveGlow: a flow-based network capable of generating high quality speech from mel-spectrograms.
Ranked #12 on
Speech Synthesis
on LibriTTS
no code implementations • 15 Mar 2017 • Sercan O. Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Chris Fougner, Ryan Prenger, Adam Coates
Keyword spotting (KWS) constitutes a major component of human-technology interfaces.
36 code implementations • 8 Dec 2015 • Dario Amodei, Rishita Anubhai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Jingdong Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Erich Elsen, Jesse Engel, Linxi Fan, Christopher Fougner, Tony Han, Awni Hannun, Billy Jun, Patrick LeGresley, Libby Lin, Sharan Narang, Andrew Ng, Sherjil Ozair, Ryan Prenger, Jonathan Raiman, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Yi Wang, Zhiqian Wang, Chong Wang, Bo Xiao, Dani Yogatama, Jun Zhan, Zhenyao Zhu
We show that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech--two vastly different languages.
24 code implementations • 17 Dec 2014 • Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng
We present a state-of-the-art speech recognition system developed using end-to-end deep learning.