HUI-Audio-Corpus-German: A high quality TTS dataset

11 Jun 2021  ·  Pascal Puchtler, Johannes Wirth, René Peinl ·

The increasing availability of audio data on the internet lead to a multitude of datasets for development and training of text to speech applications, based on neural networks. Highly differing quality of voice, low sampling rates, lack of text normalization and disadvantageous alignment of audio samples to corresponding transcript sentences still limit the performance of deep neural networks trained on this task. Additionally, data resources in languages like German are still very limited. We introduce the "HUI-Audio-Corpus-German", a large, open-source dataset for TTS engines, created with a processing pipeline, which produces high quality audio to transcription alignments and decreases manual effort needed for creation.

PDF Abstract

Datasets


Introduced in the Paper:

HUI speech corpus

Used in the Paper:

LibriSpeech LibriTTS

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here