Search Results for author: François G. Germain

Found 6 papers, 1 paper with code

Why does music source separation benefit from cacophony?

no code implementations • 28 Feb 2024 • Chang-Bin Jeon, Gordon Wichern, François G. Germain, Jonathan Le Roux

In music source separation, a standard training data augmentation procedure is to create new training samples by randomly combining instrument stems from different songs.

Data Augmentation, Music Source Separation

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

1 code implementation • 27 Feb 2024 • Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

Existing neural field (NF)-based methods focused on estimating the magnitude of the HRTF from a given sound source direction, and the magnitude is then converted to a finite impulse response (FIR) filter.

Spatial Interpolation

Generation or Replication: Auscultating Audio Latent Diffusion Models

no code implementations • 16 Oct 2023 • Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux

The introduction of audio latent diffusion models possessing the ability to generate realistic sound clips on demand from a text description has the potential to revolutionize how we work with audio.

AudioCaps, Memorization +1

Pac-HuBERT: Self-Supervised Music Source Separation via Primitive Auditory Clustering and Hidden-Unit BERT

no code implementations • 4 Apr 2023 • Ke Chen, Gordon Wichern, François G. Germain, Jonathan Le Roux

In this paper, we propose a self-supervised learning framework for music source separation inspired by the HuBERT speech representation model.

Clustering, Music Source Separation +1

Cold Diffusion for Speech Enhancement

no code implementations • 4 Nov 2022 • Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux

Diffusion models have recently shown promising results for difficult enhancement tasks such as the conditional and unconditional restoration of natural images and audio signals.

Speech Enhancement

Late Audio-Visual Fusion for In-The-Wild Speaker Diarization

no code implementations • 2 Nov 2022 • Zexu Pan, Gordon Wichern, François G. Germain, Aswin Subramanian, Jonathan Le Roux

Speaker diarization is well studied for constrained audio but little explored for challenging in-the-wild videos, which have more speakers, shorter utterances, and inconsistent on-screen speakers.

Speaker Diarization +1
