Search Results for author: Matthew Baas

Found 8 papers, 5 papers with code

Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices

no code implementations • 12 Oct 2023 • Matthew Baas, Herman Kamper

Nevertheless, this shows that voice conversion models - and kNN-VC in particular - are increasingly applicable in a range of non-standard downstream tasks.

Voice Conversion

Disentanglement in a GAN for Unconditional Speech Synthesis

1 code implementation • 4 Jul 2023 • Matthew Baas, Herman Kamper

We confirm that ASGAN's latent space is disentangled: we demonstrate how simple linear operations in the space can be used to perform several tasks unseen during training.

Disentanglement, Generative Adversarial Network, +5
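
A minimal sketch of the kind of linear latent-space operations the abstract refers to: interpolating between two latents and moving a latent along a semantic direction. The vector size and the `direction` vector here are illustrative placeholders, not ASGAN's actual latent dimensionality or learned directions.

```python
# Illustrative linear edits in a latent space; `direction` stands in for a
# semantic direction one might find in a disentangled space, and the decoded
# audio-feature generation step is omitted.
import numpy as np

rng = np.random.default_rng(0)
w_a = rng.standard_normal(512)          # latent for utterance A
w_b = rng.standard_normal(512)          # latent for utterance B
direction = rng.standard_normal(512)    # e.g. a hypothetical "speaker" direction

# Interpolation: points on the line between two latents.
alphas = np.linspace(0.0, 1.0, 5)
interpolated = [(1 - a) * w_a + a * w_b for a in alphas]

# Editing: move a latent along the (unit-normalised) direction.
edited = w_a + 2.0 * direction / np.linalg.norm(direction)
# Each resulting latent would then be decoded by the generator into a
# sequence of audio features and vocoded to a waveform.
```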

Voice Conversion With Just Nearest Neighbors

1 code implementation • 30 May 2023 • Matthew Baas, Benjamin van Niekerk, Herman Kamper

Any-to-any voice conversion aims to transform source speech into a target voice with just a few examples of the target speaker as a reference.

 Ranked #1 on Voice Conversion on LibriSpeech test-clean (using extra training data)

Voice Conversion
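
A minimal sketch of the frame-level nearest-neighbours matching at the core of kNN-VC, assuming source and reference features have already been extracted with a self-supervised speech encoder (the paper uses WavLM). Feature extraction and vocoding back to a waveform are omitted, and the shapes and `k` below are illustrative.

```python
# Each source frame is replaced with the mean of its k closest frames from
# the target speaker's reference features; `src` is (T_s, D), `ref` is (T_r, D).
import numpy as np

def knn_convert(src: np.ndarray, ref: np.ndarray, k: int = 4) -> np.ndarray:
    # Cosine distance between every source frame and every reference frame.
    src_n = src / np.linalg.norm(src, axis=1, keepdims=True)
    ref_n = ref / np.linalg.norm(ref, axis=1, keepdims=True)
    dists = 1.0 - src_n @ ref_n.T                 # (T_s, T_r)
    # Replace each source frame with the mean of its k nearest reference frames.
    nearest = np.argsort(dists, axis=1)[:, :k]    # (T_s, k)
    return ref[nearest].mean(axis=1)              # (T_s, D)

converted = knn_convert(np.random.randn(200, 1024), np.random.randn(500, 1024))
# `converted` would then be passed to a vocoder to synthesise the target voice.
```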

TransFusion: Transcribing Speech with Multinomial Diffusion

1 code implementation • 14 Oct 2022 • Matthew Baas, Kevin Eloff, Herman Kamper

In this work we aim to see whether the benefits of diffusion models can also be realized for speech recognition.

Denoising, Image Generation, +3
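
A hedged sketch of the forward (noising) step in multinomial diffusion over discrete transcript tokens, the general technique the title refers to: with probability beta_t each token is resampled uniformly from the vocabulary. This illustrates the technique in general, not TransFusion's actual noise schedule or its conditioning on speech features.

```python
# Forward corruption step of multinomial (categorical) diffusion on token ids.
import numpy as np

def noise_step(tokens: np.ndarray, beta_t: float, vocab_size: int,
               rng: np.random.Generator) -> np.ndarray:
    corrupt = rng.random(tokens.shape) < beta_t                  # which positions to corrupt
    random_tokens = rng.integers(0, vocab_size, size=tokens.shape)
    return np.where(corrupt, random_tokens, tokens)

rng = np.random.default_rng(0)
transcript = np.array([7, 4, 11, 11, 14])   # toy token ids
noisy = noise_step(transcript, beta_t=0.3, vocab_size=28, rng=rng)
# A model p(x_{t-1} | x_t, speech) is trained to reverse this corruption, so
# decoding starts from pure noise and iteratively denoises into a transcript.
```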

GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models

1 code implementation • 11 Oct 2022 • Matthew Baas, Herman Kamper

As in the StyleGAN family of image synthesis models, ASGAN maps sampled noise to a disentangled latent vector which is then mapped to a sequence of audio features so that signal aliasing is suppressed at every layer.

Disentanglement, Generative Adversarial Network, +2
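
A heavily simplified, hypothetical sketch of the StyleGAN-style pipeline described above: sampled noise z passes through a mapping MLP to a latent w, which is then projected to a sequence of audio feature frames. ASGAN's actual layer sizes and its anti-aliased (low-pass filtered) layers are not reproduced here.

```python
# Toy mapping + synthesis stack in the StyleGAN spirit: z -> w -> audio features.
import torch
import torch.nn as nn

class ToyMapping(nn.Module):
    def __init__(self, z_dim=512, w_dim=512):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, w_dim), nn.LeakyReLU(0.2),
                                 nn.Linear(w_dim, w_dim), nn.LeakyReLU(0.2))

    def forward(self, z):
        return self.net(z)          # disentangled latent w

class ToySynthesis(nn.Module):
    def __init__(self, w_dim=512, n_frames=128, n_feats=80):
        super().__init__()
        self.n_frames, self.n_feats = n_frames, n_feats
        self.proj = nn.Linear(w_dim, n_frames * n_feats)

    def forward(self, w):
        # Map w to a (batch, frames, features) sequence of audio features,
        # which a separate vocoder would turn into a waveform.
        return self.proj(w).view(-1, self.n_frames, self.n_feats)

z = torch.randn(4, 512)
features = ToySynthesis()(ToyMapping()(z))   # shape (4, 128, 80)
```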

Voice Conversion Can Improve ASR in Very Low-Resource Settings

no code implementations • 4 Nov 2021 • Matthew Baas, Herman Kamper

In this work we assess whether a VC system can be used cross-lingually to improve low-resource speech recognition.

Data Augmentation, speech-recognition, +2
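
A sketch of the data-augmentation recipe the description implies: convert each low-resource training utterance into additional synthetic voices with a voice conversion system, then train the recognizer on the union of real and converted audio. `convert` and `synthetic_speakers` are placeholders, not an API from the paper.

```python
# Voice-conversion-based augmentation loop; transcripts stay unchanged
# because only the voice identity of each utterance is altered.
import random

def augment_dataset(dataset, convert, synthetic_speakers, n_copies=2):
    """dataset: list of (waveform, transcript) pairs; convert(wav, spk) -> wav."""
    augmented = list(dataset)
    for wav, text in dataset:
        for spk in random.sample(synthetic_speakers, k=n_copies):
            augmented.append((convert(wav, spk), text))
    return augmented
```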

StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts

no code implementations • 31 May 2021 • Matthew Baas, Herman Kamper

We specifically extend the recent StarGAN-VC model by conditioning it on a speaker embedding (from a potentially unseen speaker).

Voice Conversion
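
A hypothetical sketch of the conditioning change described: instead of a fixed speaker ID, the generator receives a speaker embedding (produced by some speaker encoder from a few seconds of a possibly unseen speaker), here simply broadcast and concatenated to the content features at every frame. This illustrates the idea, not StarGAN-ZSVC's actual architecture.

```python
# Toy generator conditioned on a speaker embedding via per-frame concatenation.
import torch
import torch.nn as nn

class ToyConditionedGenerator(nn.Module):
    def __init__(self, feat_dim=80, spk_dim=256, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + spk_dim, hidden),
                                 nn.ReLU(),
                                 nn.Linear(hidden, feat_dim))

    def forward(self, feats, spk_emb):
        # feats: (batch, frames, feat_dim); spk_emb: (batch, spk_dim)
        spk = spk_emb.unsqueeze(1).expand(-1, feats.size(1), -1)
        return self.net(torch.cat([feats, spk], dim=-1))

out = ToyConditionedGenerator()(torch.randn(2, 100, 80), torch.randn(2, 256))
```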
