no code implementations • 28 Feb 2024 • Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux
In music source separation, a standard training data augmentation procedure is to create new training samples by randomly combining instrument stems from different songs.
1 code implementation • 27 Feb 2024 • Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux
Existing neural field (NF)-based methods focus on estimating the magnitude of the head-related transfer function (HRTF) for a given sound source direction; the magnitude is then converted to a finite impulse response (FIR) filter.
no code implementations • 16 Oct 2023 • Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux
The introduction of audio latent diffusion models that can generate realistic sound clips on demand from a text description has the potential to revolutionize how we work with audio.
no code implementations • 4 Apr 2023 • Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux
In this paper, we propose a self-supervised learning framework for music source separation inspired by the HuBERT speech representation model.
no code implementations • 4 Nov 2022 • Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux
Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals.
no code implementations • 2 Nov 2022 • Zexu Pan, Gordon Wichern, François G. Germain, Aswin Subramanian, Jonathan Le Roux
Speaker diarization is well studied for constrained audio but little explored for challenging in-the-wild videos, which have more speakers, shorter utterances, and inconsistent on-screen speakers.