Search Results for author: François G. Germain

Found 6 papers, 1 paper with code

Why does music source separation benefit from cacophony?

no code implementations · 28 Feb 2024 · Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux

In music source separation, a standard training data augmentation procedure is to create new training samples by randomly combining instrument stems from different songs.

Data Augmentation, Music Source Separation

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

1 code implementation · 27 Feb 2024 · Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Existing NF-based methods focused on estimating the magnitude of the HRTF from a given sound source direction, and the magnitude is converted to a finite impulse response (FIR) filter.

Spatial Interpolation

Generation or Replication: Auscultating Audio Latent Diffusion Models

no code implementations · 16 Oct 2023 · Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

The introduction of audio latent diffusion models possessing the ability to generate realistic sound clips on demand from a text description has the potential to revolutionize how we work with audio.

AudioCaps, Memorization +1

Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT

no code implementations · 4 Apr 2023 · Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux

In this paper, we propose a self-supervised learning framework for music source separation inspired by the HuBERT speech representation model.

Clustering, Music Source Separation +1

Cold Diffusion for Speech Enhancement

no code implementations · 4 Nov 2022 · Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux

Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals.

Speech Enhancement

Late Audio-Visual Fusion for In-The-Wild Speaker Diarization

no code implementations · 2 Nov 2022 · Zexu Pan, Gordon Wichern, François G. Germain, Aswin Subramanian, Jonathan Le Roux

Speaker diarization is well studied for constrained audio but little explored for challenging in-the-wild videos, which have more speakers, shorter utterances, and inconsistent on-screen speakers.

speaker-diarization, Speaker Diarization +1
