Voice Conversion

18 papers with code · Speech

Leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks

23 Sep 2017 r9y9/gantts

In the proposed framework incorporating the GANs, the discriminator is trained to distinguish natural and generated speech parameters, while the acoustic models are trained to minimize the weighted sum of the conventional minimum generation loss and an adversarial loss for deceiving the discriminator.

SPEECH SYNTHESIS VOICE CONVERSION
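
The objective described in the entry above, a weighted sum of a generation loss and an adversarial loss, can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes plain mean squared error as a stand-in for the minimum generation loss, a negative log-likelihood form for the adversarial term, and a hypothetical `adv_weight` parameter for the weighting. All function names are illustrative.

```python
import math


def mse(pred, target):
    """Mean squared error, used here as a stand-in for the generation loss (assumption)."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def adversarial_loss(disc_probs):
    """Average negative log-probability that the discriminator assigns to
    'natural' for generated parameters. Lower means the discriminator is
    being deceived more successfully. The exact form is an assumption."""
    return -sum(math.log(p) for p in disc_probs) / len(disc_probs)


def acoustic_model_loss(pred, target, disc_probs, adv_weight=1.0):
    """Weighted sum of the generation loss and the adversarial loss.
    adv_weight is a hypothetical hyperparameter controlling the trade-off."""
    return mse(pred, target) + adv_weight * adversarial_loss(disc_probs)


if __name__ == "__main__":
    generated = [0.0, 0.0]          # toy generated speech parameters
    natural = [1.0, 1.0]            # toy natural speech parameters
    disc_on_generated = [0.5, 0.5]  # discriminator's P(natural) for each generated frame
    print(acoustic_model_loss(generated, natural, disc_on_generated, adv_weight=2.0))
```

Setting `adv_weight` to zero recovers the conventional generation-loss-only training in this sketch, while larger values push the acoustic model harder toward outputs the discriminator accepts as natural.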

AdaGAN: Adaptive GAN for Many-to-Many Non-Parallel Voice Conversion

ICLR 2020 liusongxiang/StarGAN-Voice-Conversion

In this paper, we propose a novel style transfer architecture, which can also be extended to generate voices even for target speakers whose data were not used in training (i.e., the zero-shot learning case).

STYLE TRANSFER VOICE CONVERSION ZERO-SHOT LEARNING

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

NeurIPS 2019 liusongxiang/StarGAN-Voice-Conversion

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

AUDIO GENERATION VOICE CONVERSION

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

14 May 2019 liusongxiang/StarGAN-Voice-Conversion

On the other hand, CVAE training is simple but does not come with the distribution-matching property of a GAN.

STYLE TRANSFER VOICE CONVERSION

StarGAN-VC: Non-parallel many-to-many voice conversion with star generative adversarial networks

6 Jun 2018 liusongxiang/StarGAN-Voice-Conversion

This paper proposes a method that allows non-parallel many-to-many voice conversion (VC) by using a variant of a generative adversarial network (GAN) called StarGAN.

VOICE CONVERSION

Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations

9 Apr 2018 jjery2243542/voice_conversion

The decoder then takes the speaker-independent latent representation and the target speaker embedding as the input to generate the voice of the target speaker with the linguistic content of the source utterance.

VOICE CONVERSION

One-shot Voice Conversion by Separating Speaker and Content Representations with Instance Normalization

10 Apr 2019 jjery2243542/adaptive_voice_conversion

Recently, voice conversion (VC) without parallel data has been successfully adapted to the multi-target scenario, in which a single model is trained to convert the input voice to many different speakers.

VOICE CONVERSION

Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks

4 Apr 2017 JeremyCCHsu/vae-npvc

Building a voice conversion (VC) system from non-parallel speech corpora is challenging but highly valuable in real application scenarios.

VOICE CONVERSION

Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder

13 Oct 2016 JeremyCCHsu/vae-npvc

We propose a flexible framework for spectral conversion (SC) that facilitates training with unaligned corpora.

VOICE CONVERSION

Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion

28 May 2019 andi611/ZeroSpeech-TTS-without-T

We found that the proposed encoding method offers automatic extraction of speech content from speaker style, and is sufficient to cover full linguistic content in a given language.

VOICE CONVERSION