no code implementations • 7 Jun 2022 • Dan Oneata, Beata Lorincz, Adriana Stan, Horia Cucu
This modularity enables the easy replacement of each of its components, while also ensuring fast adaptation to new speaker identities by disentangling or projecting the input features.
no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu
The visualisation of the t-SNE projections of the natural and synthesised speaker embeddings shows that the acoustic model shifts the neural representations of some speakers, but not all of them.
no code implementations • 3 Jun 2021 • Beata Lorincz, Adriana Stan, Mircea Giurgiu
Building multispeaker neural network-based text-to-speech synthesis systems commonly relies on the availability of large amounts of high-quality recordings from each speaker and on conditioning the training process on the speaker's identity or on a learned representation of it.