no code implementations • 12 May 2025 • BoWen Zhang, Congchao Guo, Geng Yang, Hang Yu, Haozhe Zhang, Heidi Lei, Jialong Mai, Junjie Yan, Kaiyue Yang, Mingqi Yang, Peikai Huang, Ruiyang Jin, Sitan Jiang, Weihua Cheng, Yawei Li, Yichen Xiao, Yiying Zhou, Yongmao Zhang, Yuan Lu, Yucen He
We introduce MiniMax-Speech, an autoregressive Transformer-based Text-to-Speech (TTS) model that generates high-quality speech.
1 code implementation • 6 Jan 2024 • Jiaqing Zhang, Jie Lei, Weiying Xie, Geng Yang, Daixun Li, Yunsong Li
Additionally, the information distribution flow (IDF) in MIVit enhances performance-awareness by distributing global classification information across different modalities' feature maps.
no code implementations • 17 Sep 2021 • Xiuqiang He, Hua Geng, Geng Yang
It is deemed that a DEM can be used to represent the whole WF to evaluate its impact on the SSS of power systems, as long as the frequency response of the DEM adequately matches that of the detailed WF model around the frequency of oscillation modes of concern.
1 code implementation • 12 Aug 2020 • Haohe Liu, Lei Xie, Jian Wu, Geng Yang
We aim to address the major issues in CNN-based high-resolution MSS models: high computational cost and weight sharing between distinctly different bands.
Audio and Speech Processing • Sound
9 code implementations • Interspeech 2020 • Geng Yang, Shan Yang, Kai Liu, Peng Fang, Wei Chen, Lei Xie
In this paper, we propose multi-band MelGAN, a much faster waveform generation model targeting high-quality text-to-speech.
Sound • Audio and Speech Processing