no code implementations • 28 Feb 2023 • Jocelyn Huang, Evelina Bakhturina, Oktai Tatanov
Grapheme-to-phoneme (G2P) transduction is part of the standard text-to-speech (TTS) pipeline.
no code implementations • 16 Feb 2023 • Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg
In this work, we propose a zero-shot voice conversion method using speech representations trained with self-supervised learning.
1 code implementation • 12 Jul 2022 • Isha Hameed, Samuel Sharpe, Daniel Barcklow, Justin Au-Yeung, Sahil Verma, Jocelyn Huang, Brian Barr, C. Bayan Bruss
The goal is to assess the sensitivity of the model's performance by perturbing the input variables in rank order of importance.
Explainable Artificial Intelligence (XAI)
15 code implementations • 22 Oct 2019 • Samuel Kriman, Stanislav Beliaev, Boris Ginsburg, Jocelyn Huang, Oleksii Kuchaiev, Vitaly Lavrukhin, Ryan Leary, Jason Li, Yang Zhang
We propose a new end-to-end neural acoustic model for automatic speech recognition.
Ranked #33 on Speech Recognition on LibriSpeech test-clean
Speech Recognition • Audio and Speech Processing
1 code implementation • 14 Sep 2019 • Oleksii Kuchaiev, Jason Li, Huyen Nguyen, Oleksii Hrinchuk, Ryan Leary, Boris Ginsburg, Samuel Kriman, Stanislav Beliaev, Vitaly Lavrukhin, Jack Cook, Patrice Castonguay, Mariya Popova, Jocelyn Huang, Jonathan M. Cohen
NeMo (Neural Modules) is a Python framework-agnostic toolkit for creating AI applications through re-usability, abstraction, and composition.
Ranked #1 on Speech Recognition on Common Voice Spanish (using extra training data)
Automatic Speech Recognition (ASR)