Search Results for author: Kexin Zhao

Found 6 papers, 3 papers with code

DiffWave: A Versatile Diffusion Model for Audio Synthesis

11 code implementations ICLR 2021 Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro

In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional and unconditional waveform generation.

Audio Synthesis Speech Synthesis

Parallel Neural Text-to-Speech

no code implementations ICLR 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we first propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

WaveFlow: A Compact Flow-based Model for Raw Audio

4 code implementations ICML 2020 Wei Ping, Kainan Peng, Kexin Zhao, Zhao Song

WaveFlow provides a unified view of likelihood-based models for 1-D data, including WaveNet and WaveGlow as special cases.

Speech Synthesis

Multi-Speaker End-to-End Speech Synthesis

no code implementations 9 Jul 2019 Jihyun Park, Kexin Zhao, Kainan Peng, Wei Ping

In this work, we extend ClariNet (Ping et al., 2019), a fully end-to-end speech synthesis model (i.e., text-to-wave), to generate high-fidelity speech from multiple speakers.

Speech Synthesis

Non-Autoregressive Neural Text-to-Speech

2 code implementations ICML 2020 Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

In this work, we propose ParaNet, a non-autoregressive seq2seq model that converts text to spectrogram.

Text-To-Speech Synthesis

Trace norm regularization and faster inference for embedded speech recognition RNNs

no code implementations ICLR 2018 Markus Kliegl, Siddharth Goyal, Kexin Zhao, Kavya Srinet, Mohammad Shoeybi

We propose and evaluate new techniques for compressing and speeding up dense matrix multiplications as found in the fully connected and recurrent layers of neural networks for embedded large vocabulary continuous speech recognition (LVCSR).

Speech Recognition
