Search Results for author: Matthew Baas

Found 8 papers, 5 papers with code

Voice Conversion for Stuttered Speech, Instruments, Unseen Languages and Textually Described Voices

no code implementations • 12 Oct 2023 • Matthew Baas, Herman Kamper

Nevertheless, this shows that voice conversion models, and kNN-VC in particular, are increasingly applicable to a range of non-standard downstream tasks.

Voice Conversion

Disentanglement in a GAN for Unconditional Speech Synthesis

1 code implementation • 4 Jul 2023 • Matthew Baas, Herman Kamper

We confirm that ASGAN's latent space is disentangled: we demonstrate how simple linear operations in the space can be used to perform several tasks unseen during training.

Disentanglement, Generative Adversarial Network, +5 more

Voice Conversion With Just Nearest Neighbors

1 code implementation • 30 May 2023 • Matthew Baas, Benjamin van Niekerk, Herman Kamper

Any-to-any voice conversion aims to transform source speech into a target voice with just a few examples of the target speaker as a reference.

Ranked #1 on Voice Conversion on LibriSpeech test-clean (using extra training data)

Voice Conversion

TransFusion: Transcribing Speech with Multinomial Diffusion

1 code implementation • 14 Oct 2022 • Matthew Baas, Kevin Eloff, Herman Kamper

In this work we aim to see whether the benefits of diffusion models can also be realized for speech recognition.

Denoising, Image Generation, +3 more

GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models

1 code implementation • 11 Oct 2022 • Matthew Baas, Herman Kamper

As in the StyleGAN family of image synthesis models, ASGAN maps sampled noise to a disentangled latent vector which is then mapped to a sequence of audio features so that signal aliasing is suppressed at every layer.

Disentanglement, Generative Adversarial Network, +2 more

Voice Conversion Can Improve ASR in Very Low-Resource Settings

no code implementations • 4 Nov 2021 • Matthew Baas, Herman Kamper

In this work we assess whether a VC system can be used cross-lingually to improve low-resource speech recognition.

Data Augmentation, Speech Recognition, +2 more

Analyzing Speaker Information in Self-Supervised Models to Improve Zero-Resource Speech Processing

1 code implementation • 2 Aug 2021 • Benjamin van Niekerk, Leanne Nortje, Matthew Baas, Herman Kamper

In this paper, we first show that the per-utterance mean of CPC features captures speaker information to a large extent.

Acoustic Unit Discovery, Language Modelling, +1 more

StarGAN-ZSVC: Towards Zero-Shot Voice Conversion in Low-Resource Contexts

no code implementations • 31 May 2021 • Matthew Baas, Herman Kamper

We specifically extend the recent StarGAN-VC model by conditioning it on a speaker embedding (from a potentially unseen speaker).

Voice Conversion
