Text-To-Speech Synthesis

7 papers with code · Speech

Latest papers without code

Evaluating Long-form Text-to-Speech: Comparing the Ratings of Sentences and Paragraphs

9 Sep 2019

We compare the results obtained from evaluating sentences in isolation, evaluating whole paragraphs of speech, and presenting a selection of speech or text as context and evaluating the subsequent speech.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

From Speech Chain to Multimodal Chain: Leveraging Cross-modal Data Augmentation for Semi-supervised Learning

3 Jun 2019

ASR, TTS, IC, and IR components can be trained in a semi-supervised fashion by assisting each other given incomplete datasets and leveraging cross-modal data augmentation within the chain.

DATA AUGMENTATION · IMAGE CAPTIONING · IMAGE RETRIEVAL · SPEECH RECOGNITION · SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

Neural Text Normalization with Subword Units

NAACL 2019

We find subword models with additional linguistic features yield the best performance (with a word error rate of 0.17%).

MACHINE TRANSLATION · SPEECH RECOGNITION · SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data

NAACL 2019

Neural text-to-speech synthesis (NTTS) models have shown significant progress in generating high-quality speech; however, they require a large quantity of training data.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS · WORD EMBEDDINGS

FastSpeech: Fast, Robust and Controllable Text to Speech

22 May 2019

Experiments on the LJSpeech dataset show that our parallel model matches autoregressive models in terms of speech quality, nearly eliminates the problem of word skipping and repeating in particularly hard cases, and can adjust voice speed smoothly.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

Direct speech-to-speech translation with a sequence-to-sequence model

12 Apr 2019

We present an attention-based sequence-to-sequence neural network which can directly translate speech from one language into speech in another language, without relying on an intermediate text representation.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

Token-Level Ensemble Distillation for Grapheme-to-Phoneme Conversion

6 Apr 2019

Recently, G2P conversion has been viewed as a sequence-to-sequence task and modeled with RNN- or CNN-based encoder-decoder frameworks.

SPEECH RECOGNITION · TEXT-TO-SPEECH SYNTHESIS

Speech denoising by parametric resynthesis

2 Apr 2019

In comparison to two denoising systems, the oracle Wiener mask and a DNN-based mask predictor, our model equals the oracle Wiener mask in subjective quality and intelligibility and surpasses the realistic system.

DENOISING · SPEECH ENHANCEMENT · SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis

14 Mar 2019

The results show that the newly proposed GANs achieve synthesis quality comparable to that of widely-used DNNs, without using an additive noise component.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS