no code implementations • LREC 2022 • Nadja Schauffler, Toni Bernhart, Andre Blessing, Gunilla Eschenbach, Markus Gärtner, Kerstin Jung, Anna Kinder, Julia Koch, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu, Lorenz Wesemann, Jonas Kuhn
We present the steps taken towards an exploration platform for a multi-modal corpus of German lyric poetry from the Romantic era developed in the project »textklang«.
1 code implementation • 26 Oct 2023 • Florian Lux, Julia Koch, Sarina Meyer, Thomas Bott, Nadja Schauffler, Pavel Denisov, Antje Schweitzer, Ngoc Thang Vu
For our contribution to the Blizzard Challenge 2023, we improved on the system we submitted to the Blizzard Challenge 2021.
1 code implementation • 21 Oct 2022 • Florian Lux, Julia Koch, Ngoc Thang Vu
While neural methods for text-to-speech (TTS) have shown great advances in modeling multiple speakers, even in zero-shot settings, the amount of data needed for those approaches is generally not feasible for the vast majority of the world's over 6,000 spoken languages.
1 code implementation • 13 Oct 2022 • Sarina Meyer, Pascal Tilli, Pavel Denisov, Florian Lux, Julia Koch, Ngoc Thang Vu
To protect the privacy of speech data, speaker anonymization aims to hide the identity of a speaker by changing the voice in speech recordings.
1 code implementation • 11 Jul 2022 • Sarina Meyer, Florian Lux, Pavel Denisov, Julia Koch, Pascal Tilli, Ngoc Thang Vu
In this work, we propose a speaker anonymization pipeline that leverages high-quality automatic speech recognition and synthesis systems to generate speech conditioned on phonetic transcriptions and anonymized speaker embeddings.
no code implementations • 11 Jul 2022 • Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu
Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech.
2 code implementations • 24 Jun 2022 • Florian Lux, Julia Koch, Ngoc Thang Vu
The cloning of a speaker's voice using an untranscribed reference sample is one of the great advances of modern neural text-to-speech (TTS) methods.