Search Results for author: Thilo Koehler

Found 4 papers, 0 papers with code

Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis

no code implementations • 19 Jan 2024 • Prabhav Agrawal, Thilo Koehler, Zhiping Xiu, Prashant Serai, Qing He

A DSP vocoder often yields lower audio quality because it consumes over-smoothed acoustic model predictions of approximate representations of the vocal tract.

Speech Synthesis

Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling

no code implementations • 1 Apr 2021 • Qing He, Zhiping Xiu, Thilo Koehler, JiLong Wu

Typical high quality text-to-speech (TTS) systems today use a two-stage architecture, with a spectrum model stage that generates spectral frames and a vocoder stage that generates the actual audio.

FBWave: Efficient and Scalable Neural Vocoders for Streaming Text-To-Speech on the Edge

no code implementations • 25 Nov 2020 • Bichen Wu, Qing He, Peizhao Zhang, Thilo Koehler, Kurt Keutzer, Peter Vajda

More efficient variants of FBWave can achieve up to 109x fewer MACs while still delivering acceptable audio quality.

G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR

no code implementations • 22 Oct 2019 • Duc Le, Thilo Koehler, Christian Fuegen, Michael L. Seltzer

Grapheme-based acoustic modeling has recently been shown to outperform phoneme-based approaches in both hybrid and end-to-end automatic speech recognition (ASR), even on non-phonemic languages like English.

Automatic Speech Recognition (ASR)
