Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis

In this work, we propose "global style tokens" (GSTs), a bank of embeddings that are jointly trained within Tacotron, a state-of-the-art end-to-end speech synthesis system. The embeddings are trained with no explicit labels, yet learn to model a large range of acoustic expressiveness... (read more)

PDF Abstract ICML 2018 PDF ICML 2018 Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper