no code implementations • 2 Feb 2024 • Marco Pasini, Maarten Grachten, Stefan Lattner
At the core of our method are audio autoencoders that efficiently compress audio waveform samples into invertible latent representations, and a conditional latent diffusion model that takes as input the latent encoding of a mix and generates the latent encoding of a corresponding stem.
1 code implementation • 18 Aug 2022 • Marco Pasini, Jan Schlüter
We release the source code and pretrained autoencoder weights at github.com/marcoppasini/musika, such that a GAN can be trained on a new music domain with a single GPU in a matter of hours.
2 code implementations • 8 Oct 2019 • Marco Pasini
We propose MelGAN-VC, a voice conversion method that relies on non-parallel speech data and is able to convert audio signals of arbitrary length from a source voice to a target voice.