Search Results for author: Ruiqing Xue

Found 1 paper, 0 papers with code

FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model

no code implementations • 6 Mar 2023 • Ruiqing Xue, Yanqing Liu, Lei He, Xu Tan, Linquan Liu, Edward Lin, Sheng Zhao

Neural text-to-speech (TTS) generally uses either a cascaded architecture with a separately optimized acoustic model and vocoder, or an end-to-end architecture in which continuous mel-spectrograms or self-extracted speech frames serve as the intermediate representation bridging the acoustic model and the vocoder. Both approaches suffer from two limitations: 1) continuous acoustic frames are hard to predict from phonemes alone, so acoustic information such as duration or pitch is also needed to resolve the one-to-many problem, and this is hard to scale to large and noisy datasets; 2) achieving diverse speech output from continuous speech features usually requires complex VAE or flow-based models.

Language Modelling • Large Language Model • +1
