1 code implementation • 26 Jul 2021 • Csaba Zainkó, László Tóth, Amin Honarmandi Shandiz, Gábor Gosztolya, Alexandra Markó, Géza Németh, Tamás Gábor Csapó
In this paper, we experimented with transfer learning and adaptation of a Tacotron2 text-to-speech model to improve the final synthesis quality of ultrasound-based articulatory-to-acoustic mapping with a limited database.
no code implementations • 19 Jun 2021 • Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Géza Németh
Vocoders have received renewed attention as main components in statistical parametric text-to-speech (TTS) synthesis and speech transformation systems.
no code implementations • 12 Jun 2021 • Mohammed Salah Al-Radhi, Tamás Gábor Csapó, Csaba Zainkó, Géza Németh
To date, various speech technology systems have adopted the vocoder approach, a method for synthesizing speech waveforms that plays a major role in the performance of statistical parametric speech synthesis.
no code implementations • RANLP 2019 • Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth
Self-attention networks (SAN) have shown promising performance in various Natural Language Processing (NLP) scenarios, especially in machine translation.
1 code implementation • arXiv preprint 2019 • Sevinj Yolchuyeva, Géza Németh, Bálint Gyires-Tóth
The transformer network architecture is completely based on attention mechanisms, and it outperforms sequence-to-sequence models in neural machine translation without recurrent and convolutional layers.
Automatic Speech Recognition (ASR)
no code implementations • 24 Jun 2019 • Tamás Gábor Csapó, Mohammed Salah Al-Radhi, Géza Németh, Gábor Gosztolya, Tamás Grósz, László Tóth, Alexandra Markó
Recently it was shown that within the Silent Speech Interface (SSI) field, the prediction of F0 is possible from Ultrasound Tongue Images (UTI) as the articulatory input, using Deep Neural Networks for articulatory-to-acoustic mapping.
Sound • Audio and Speech Processing