no code implementations • 29 Oct 2020 • Yanpei Shi, Mingjie Chen, Qiang Huang, Thomas Hain
The memory mechanism achieves 10.6% and 7.7% relative improvements compared with not using it.
1 code implementation • 22 Oct 2020 • Mingjie Chen, Yanpei Shi, Thomas Hain
In this work, we aim to improve the data efficiency of the model and achieve many-to-many non-parallel StarGAN-based voice conversion for a relatively large number of speakers with limited training samples.
no code implementations • 15 May 2020 • Yanpei Shi, Qiang Huang, Thomas Hain
To evaluate the effectiveness of the proposed approach, artificial datasets based on Switchboard Cellular Part 1 (SWBC) and Voxceleb1 are constructed under two conditions: with and without overlapped speaker voices.
no code implementations • 15 May 2020 • Yanpei Shi, Qiang Huang, Thomas Hain
The obtained results show that the proposed approach using speaker-dependent speech enhancement yields better speaker recognition and speech enhancement performance than two baselines in various noise conditions.
no code implementations • 14 Jan 2020 • Yanpei Shi, Qiang Huang, Thomas Hain
Instead of individually processing speech enhancement and speaker recognition, the two modules are integrated into one framework by a joint optimisation using deep neural networks.
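A joint objective of this kind is typically a weighted sum of an enhancement loss and a recognition loss. The sketch below is a minimal numpy illustration of that idea, not the paper's exact formulation: the trade-off weight `alpha` and the specific loss choices (MSE for enhancement, cross-entropy for speaker recognition) are assumptions.

```python
import numpy as np

def mse(enhanced, clean):
    # Enhancement objective: mean squared error against the clean signal.
    return float(np.mean((enhanced - clean) ** 2))

def cross_entropy(logits, speaker_id):
    # Recognition objective: negative log-probability of the true speaker.
    logits = logits - logits.max()              # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return float(-log_probs[speaker_id])

def joint_loss(enhanced, clean, logits, speaker_id, alpha=0.5):
    # Hypothetical joint objective: alpha trades off the two tasks so both
    # modules are optimised in one framework rather than separately.
    return alpha * mse(enhanced, clean) + (1 - alpha) * cross_entropy(logits, speaker_id)

enhanced = np.zeros(4)
clean = np.ones(4)
logits = np.array([2.0, 0.0])
loss = joint_loss(enhanced, clean, logits, speaker_id=0)
```

In a deep-learning framework both terms would be differentiable, so a single backward pass updates the enhancement and recognition networks together.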
no code implementations • 14 Jan 2020 • Yanpei Shi, Thomas Hain
The proposed approach separates different speaker properties from a two-speaker signal in embedding space.
no code implementations • 17 Oct 2019 • Yanpei Shi, Qiang Huang, Thomas Hain
In the proposed approach, a frame-level encoder and attention are applied to segments of an input utterance to generate individual segment vectors.
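Attention pooling of this kind turns the variable number of frame vectors in a segment into one fixed-size segment vector. The numpy sketch below illustrates the mechanism under stated assumptions: the dot-product scoring against a single learnable vector `w`, the feature dimension, and the segment length are all hypothetical, not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_segment_vector(frames, w):
    # frames: (T, D) frame-level encoder outputs for one segment
    # w: (D,) hypothetical learnable attention parameter
    scores = frames @ w          # (T,) per-frame relevance scores
    alpha = softmax(scores)      # attention weights summing to 1 over frames
    return alpha @ frames        # (D,) weighted-sum segment vector

rng = np.random.default_rng(0)
utterance = rng.standard_normal((100, 16))   # 100 frames, 16-dim features
w = rng.standard_normal(16)

# Split the utterance into four 25-frame segments and pool each one.
segments = utterance.reshape(4, 25, 16)
seg_vecs = np.stack([attentive_segment_vector(s, w) for s in segments])
```

Each row of `seg_vecs` is one segment vector; a second attention stage over these vectors would then summarise the whole utterance.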
no code implementations • 16 Oct 2019 • Yanpei Shi, Thomas Hain
To evaluate the effectiveness of our approaches against prior work, two tasks, phone classification and speaker recognition, are conducted and tested on different TIMIT data sets.
no code implementations • 24 Sep 2019 • Yanpei Shi, Qiang Huang, Thomas Hain
While the use of deep neural networks has significantly boosted speaker recognition performance, it is still challenging to separate speakers in poor acoustic environments.