no code implementations • CCL 2020 • Guangyi Wang, Feilong Bao, Weihua Wang
This paper proposes a classification model that combines the Bi-LSTM model with the Multi-Head Attention mechanism.
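The attention component of such a model can be illustrated with a minimal numpy sketch of scaled dot-product multi-head self-attention applied to a sequence of hidden states (standing in for Bi-LSTM outputs). The random projection matrices here are placeholders for learned weights; this is not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(H, num_heads, rng):
    """Scaled dot-product multi-head self-attention over hidden
    states H of shape (seq_len, d_model)."""
    seq_len, d_model = H.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Random projections stand in for learned Q/K/V weight matrices.
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_head)) for _ in range(3))
        Q, K, V = H @ Wq, H @ Wk, H @ Wv
        scores = softmax(Q @ K.T / np.sqrt(d_head))  # (seq_len, seq_len)
        heads.append(scores @ V)
    # Concatenate the per-head outputs back to d_model dimensions.
    return np.concatenate(heads, axis=-1)

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 8))  # e.g. Bi-LSTM outputs for 5 tokens
Z = multi_head_attention(H, num_heads=2, rng=rng)
print(Z.shape)  # (5, 8)
```

In a full classifier, the attended representation `Z` would typically be pooled and fed to a softmax output layer.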
1 code implementation • 10 Mar 2024 • Qiuyu Liang, Weihua Wang, Feilong Bao, Guanglai Gao
Specifically, we map the learned features of graph nodes into hyperbolic space, and then perform a Lorentzian linear feature transformation to capture the underlying tree-like structure of data.
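The lift into hyperbolic space that underlies Lorentzian feature transformations can be sketched as follows, using the Lorentz (hyperboloid) model. This is an illustrative sketch of the standard hyperboloid embedding, not the paper's learned transformation.

```python
import numpy as np

def to_hyperboloid(x):
    """Lift a Euclidean feature vector x onto the Lorentz model of
    hyperbolic space: p = (sqrt(1 + ||x||^2), x)."""
    x = np.asarray(x, dtype=float)
    x0 = np.sqrt(1.0 + np.dot(x, x))
    return np.concatenate(([x0], x))

def lorentz_inner(p, q):
    """Lorentzian inner product <p, q>_L = -p0*q0 + <p[1:], q[1:]>."""
    return -p[0] * q[0] + np.dot(p[1:], q[1:])

p = to_hyperboloid([0.3, -1.2, 0.5])
# Every lifted point satisfies <p, p>_L = -1, i.e. it lies on the hyperboloid.
print(lorentz_inner(p, p))
```

A Lorentzian linear layer then applies a transformation constrained to keep its output on this hyperboloid, which is what lets the model capture tree-like structure.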
1 code implementation • 11 Dec 2022 • Kailin Liang, Bin Liu, Yifan Hu, Rui Liu, Feilong Bao, Guanglai Gao
Text-to-Speech (TTS) synthesis for low-resource languages is currently an active research topic in both academia and industry.
1 code implementation • 22 Sep 2022 • Yifan Hu, Pengkai Yin, Rui Liu, Feilong Bao, Guanglai Gao
This paper introduces a high-quality open-source text-to-speech (TTS) synthesis dataset for Mongolian, a low-resource language spoken by over 10 million people worldwide.
no code implementations • COLING 2020 • Na Liu, Xiangdong Su, Haoran Zhang, Guanglai Gao, Feilong Bao
The inner-word encoder uses a self-attention mechanism to capture the inner-word features of the target word.
no code implementations • 11 Aug 2020 • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li
We propose a multi-task learning scheme for Tacotron training that optimizes the system to predict both the Mel spectrum and phrase breaks.
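A multi-task objective of this shape can be sketched as a weighted sum of a Mel-spectrum regression term and a cross-entropy term on per-token phrase-break labels. The weighting `alpha` and the two-class break/no-break setup are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def multitask_loss(mel_pred, mel_true, pb_logits, pb_labels, alpha=1.0):
    """Joint objective: Mel-spectrum regression (MSE) plus a
    cross-entropy loss on per-token phrase-break labels."""
    mel_loss = np.mean((mel_pred - mel_true) ** 2)  # spectrum term
    # Softmax cross-entropy over break / no-break classes.
    z = pb_logits - pb_logits.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    ce = -np.mean(log_probs[np.arange(len(pb_labels)), pb_labels])
    return mel_loss + alpha * ce

rng = np.random.default_rng(0)
loss = multitask_loss(rng.standard_normal((4, 80)),   # predicted Mel frames
                      rng.standard_normal((4, 80)),   # target Mel frames
                      rng.standard_normal((4, 2)),    # phrase-break logits
                      np.array([0, 1, 0, 0]))         # break labels
print(loss)
```

Both terms are differentiable, so in a real system the shared encoder receives gradients from the acoustic and the phrase-break tasks simultaneously.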
no code implementations • 2 Feb 2020 • Rui Liu, Berrak Sisman, Feilong Bao, Guanglai Gao, Haizhou Li
To address this problem, we propose a new training scheme for Tacotron-based TTS, referred to as WaveTTS, with two loss functions: 1) a time-domain loss, denoted as the waveform loss, which measures the distortion between the natural and generated waveforms; and 2) a frequency-domain loss, which measures the Mel-scale acoustic feature loss between the natural and generated acoustic features.
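The two-term objective can be sketched in numpy as follows, with a magnitude-spectrum loss standing in for the Mel-scale feature loss (computing true Mel features would require a filterbank; this is an illustrative simplification, not the paper's implementation).

```python
import numpy as np

def dual_domain_loss(wav_pred, wav_true):
    """Two-term objective: a waveform (time-domain) L1 loss plus a
    frequency-domain L1 loss on magnitude spectra, the latter a
    stand-in for a Mel-scale acoustic feature loss."""
    time_loss = np.mean(np.abs(wav_pred - wav_true))
    mag_pred = np.abs(np.fft.rfft(wav_pred))
    mag_true = np.abs(np.fft.rfft(wav_true))
    freq_loss = np.mean(np.abs(mag_pred - mag_true))
    return time_loss + freq_loss

t = np.linspace(0, 1, 256, endpoint=False)
clean = np.sin(2 * np.pi * 5 * t)
noisy = clean + 0.1 * np.random.default_rng(0).standard_normal(256)
print(dual_domain_loss(noisy, clean))  # > 0 for a distorted waveform
print(dual_domain_loss(clean, clean))  # 0.0 for a perfect reconstruction
```

Combining both domains penalizes distortions that a purely spectral loss can miss, such as phase errors in the generated waveform.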
no code implementations • 7 Nov 2019 • Rui Liu, Berrak Sisman, Jingdong Li, Feilong Bao, Guanglai Gao, Haizhou Li
We first train a Tacotron2-based TTS model by always providing natural speech frames to the decoder; this model serves as the teacher model.
no code implementations • COLING 2018 • Rui Liu, Feilong Bao, Guanglai Gao, Hui Zhang, Yonghe Wang
In this paper, we first apply word embeddings that focus on sub-word units to the Mongolian Phrase Break (PB) prediction task, using a Long Short-Term Memory (LSTM) model.
no code implementations • COLING 2016 • Weihua Wang, Feilong Bao, Guanglai Gao
The system based on segmenting suffixes with all proposed features yields a benchmark result of F-measure = 84.65 on this corpus.