1 code implementation • 26 Jul 2021 • Csaba Zainkó, László Tóth, Amin Honarmandi Shandiz, Gábor Gosztolya, Alexandra Markó, Géza Németh, Tamás Gábor Csapó
In this paper, we experimented with transfer learning and adaptation of a Tacotron2 text-to-speech model to improve the final synthesis quality of ultrasound-based articulatory-to-acoustic mapping with a limited database.
1 code implementation • 5 Jul 2021 • Tamás Gábor Csapó, László Tóth, Gábor Gosztolya, Alexandra Markó
In addition, we analyze the ultrasound tongue recordings of several speakers and show that misalignments in the ultrasound transducer positioning can degrade the final synthesis performance.
1 code implementation • 6 Aug 2020 • Tamás Gábor Csapó, Csaba Zainkó, László Tóth, Gábor Gosztolya, Alexandra Markó
The training target is the 80-dimensional mel-spectrogram, which yields a more finely detailed spectral representation than the previously used 25-dimensional Mel-Generalized Cepstrum.
Audio and Speech Processing • Sound
no code implementations • 24 Jun 2019 • Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó
It was recently shown in the Silent Speech Interface (SSI) field that F0 can be predicted from Ultrasound Tongue Images (UTI) as the articulatory input, using Deep Neural Networks for articulatory-to-acoustic mapping.
Sound • Audio and Speech Processing