Speech Synthesis

36 papers with code · Speech

Speech synthesis is the task of generating speech from text.

Please note that the leaderboards here are not directly comparable between studies, because each study uses mean opinion score (MOS) as its metric and collects a different set of samples from Amazon Mechanical Turk.
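To illustrate why such scores are hard to compare, here is a minimal sketch of how a mean opinion score might be computed from listener ratings on a 1 to 5 scale. The function name `mean_opinion_score`, the two rating lists, and the normal-approximation 95% confidence interval are all assumptions made for illustration; they are not taken from any paper listed on this page.

```python
import math
import statistics


def mean_opinion_score(ratings):
    """Return (mos, half_width) for a list of 1-5 listener ratings.

    half_width is the half-width of a 95% confidence interval under a
    normal approximation (an assumption of this sketch).
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = statistics.mean(ratings)
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings of one system, collected from two different listener pools.
study_a = [5, 4, 4, 5, 4, 5, 4, 4]
study_b = [4, 3, 4, 3, 4, 4, 3, 3]

for name, ratings in (("study A", study_a), ("study B", study_b)):
    mos, hw = mean_opinion_score(ratings)
    print(f"{name}: MOS = {mos:.3f} +/- {hw:.3f}")
```

In this made-up example the same system scores differently in the two pools, which is the kind of gap that makes MOS values from separate studies hard to rank against each other.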

(Image credit: WaveNet: A generative model for raw audio)

Latest papers with code

A Resource for Computational Experiments on Mapudungun

4 Dec 2019 · pywirrarika/naki

We present a resource for computational experiments on Mapudungun, a polysynthetic indigenous language spoken in Chile with upwards of 200 thousand speakers.

MACHINE TRANSLATION · SPEECH RECOGNITION · SPEECH SYNTHESIS

35
04 Dec 2019

Jejueo Datasets for Machine Translation and Speech Synthesis

27 Nov 2019 · kakaobrain/jejueo

Jejueo was classified as critically endangered by UNESCO in 2010.

MACHINE TRANSLATION · SPEECH SYNTHESIS

50
27 Nov 2019

Parallel WaveGAN: A fast waveform generation model based on generative adversarial networks with multi-resolution spectrogram

25 Oct 2019 · kan-bayashi/ParallelWaveGAN

We propose Parallel WaveGAN, a distillation-free, fast, and small-footprint waveform generation method using a generative adversarial network.

SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

166
25 Oct 2019

MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis

NeurIPS 2019 · descriptinc/melgan-neurips

In this paper, we show that it is possible to train GANs reliably to generate high quality coherent waveforms by introducing a set of architectural changes and simple training techniques.

SPEECH SYNTHESIS

282
08 Oct 2019

High Fidelity Speech Synthesis with Adversarial Networks

25 Sep 2019 · yanggeng1995/GAN-TTS

The application of generative adversarial networks (GANs) in the audio domain has received limited attention, and autoregressive models, such as WaveNet, remain the state of the art in generative modelling of audio signals such as human speech.

SPEECH SYNTHESIS

78
25 Sep 2019

DurIAN: Duration Informed Attention Network For Multimodal Synthesis

4 Sep 2019 · yanggeng1995/subband_WaveRNN

In this paper, we present a generic and robust multimodal synthesis system that produces highly natural speech and facial expression simultaneously.

SPEECH SYNTHESIS

18
04 Sep 2019

Unpaired Image-to-Speech Synthesis with Multimodal Information Bottleneck

ICCV 2019 · yunyikristy/skipNet

We propose a multimodal information bottleneck approach that learns the correspondence between modalities from unpaired data (image and speech) by leveraging the shared modality (text).

IMAGE GENERATION · SPEECH SYNTHESIS

1
19 Aug 2019

Deep Residual Neural Networks for Audio Spoofing Detection

30 Jun 2019 · nesl/asvspoof2019

Additionally, replay attacks, in which the attacker uses a loudspeaker to replay previously recorded genuine human speech, are also possible.

SPEAKER VERIFICATION · SPEECH SYNTHESIS · VOICE CONVERSION

18
30 Jun 2019

Using generative modelling to produce varied intonation for speech synthesis

10 Jun 2019 · ZackHodari/average_prosody

A generative model that can synthesise multiple prosodies will, by design, not model average prosody.

SPEECH SYNTHESIS

16
10 Jun 2019