no code implementations • CAI (COLING) 2022 • Zhuo Gong, Daisuke Saito, Sheng Li, Hisashi Kawai, Nobuaki Minematsu
The experiments show that an end-to-end (E2E) ASR model based on an encoder-decoder architecture can be improved by pre-training the decoder with text data.
Automatic Speech Recognition (ASR)
1 code implementation • 8 Apr 2022 • Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi
Low-resource speech recognition has long suffered from insufficient training data.
no code implementations • 31 Jul 2018 • Yi Zhao, Shinji Takaki, Hieu-Thi Luong, Junichi Yamagishi, Daisuke Saito, Nobuaki Minematsu
In order to reduce the mismatched characteristics between natural and generated acoustic features, we propose frameworks that incorporate either a conditional generative adversarial network (GAN) or its variant, Wasserstein GAN with gradient penalty (WGAN-GP), into multi-speaker speech synthesis that uses the WaveNet vocoder.