Audio Source Separation

Audio source separation is the process of separating a mixture (e.g., a pop band recording) into isolated sounds from individual sources (e.g., just the lead vocals).

Most implemented papers

Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation

f90/Wave-U-Net 8 Jun 2018

Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on hyper-parameters for the spectral front-end.
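To illustrate what ignoring phase costs, here is a minimal NumPy sketch. It is not taken from the Wave-U-Net code; the helper names `split_magnitude_phase` and `reconstruct` are hypothetical, and a single FFT over a random signal stands in for a full spectral front-end. It reconstructs the signal once with its original phase and once with the phase discarded.

```python
import numpy as np

def split_magnitude_phase(signal):
    """Return magnitude and phase of the one-sided spectrum of a real signal."""
    spectrum = np.fft.rfft(signal)
    return np.abs(spectrum), np.angle(spectrum)

def reconstruct(magnitude, phase, length):
    """Invert a one-sided spectrum given as separate magnitude and phase."""
    return np.fft.irfft(magnitude * np.exp(1j * phase), n=length)

# Assumption: a random signal stands in for an audio excerpt.
rng = np.random.default_rng(0)
signal = rng.standard_normal(1024)

magnitude, phase = split_magnitude_phase(signal)

# Reconstruction with the true phase versus with the phase thrown away.
with_phase = reconstruct(magnitude, phase, len(signal))
without_phase = reconstruct(magnitude, np.zeros_like(phase), len(signal))

error_with_phase = np.max(np.abs(signal - with_phase))
error_without_phase = np.max(np.abs(signal - without_phase))
print(error_with_phase, error_without_phase)
```

With the phase kept, the round trip is exact up to floating-point error; with the phase zeroed, the waveform is no longer recovered even though the magnitude spectrum is unchanged.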

Multi-scale Multi-band DenseNets for Audio Source Separation

Anjok07/ultimatevocalremovergui 29 Jun 2017

This paper deals with the problem of audio source separation.

Adversarial Semi-Supervised Audio Source Separation applied to Singing Voice Extraction

f90/AdversarialAudioSeparation 31 Oct 2017

Based on this idea, we drive the separator towards outputs deemed as realistic by discriminator networks that are trained to tell apart real from separator samples.

Improved Speech Enhancement with the Wave-U-Net

craigmacartney/Wave-U-Net-For-Speech-Enhancement 27 Nov 2018

We study the use of the Wave-U-Net architecture for speech enhancement, a model introduced by Stoller et al. for the separation of music vocals and accompaniment.

Learning to Separate Object Sounds by Watching Unlabeled Video

rhgao/Deep-MIML-Network ECCV 2018

Our work is the first to learn audio source separation from large-scale "in the wild" videos containing multiple audio sources per video.

Co-Separating Sounds of Visual Objects

rhgao/co-separation ICCV 2019

Learning how objects sound from video is challenging, since they often heavily overlap in a single audio channel.

Conditioned-U-Net: Introducing a Control Mechanism in the U-Net for Multiple Source Separations

gabolsgabs/cunet 2 Jul 2019

The input vector is embedded to obtain the parameters that control Feature-wise Linear Modulation (FiLM) layers.
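As a rough sketch of the mechanism described above, FiLM scales and shifts each feature channel using parameters derived from the input vector. The code below is not from the cunet repository: the one-hot condition encoding, the single linear embedding, and the names `embed_condition` and `film` are all assumptions made for illustration.

```python
import numpy as np

def embed_condition(condition, weights, bias):
    """Hypothetical linear embedding: map a condition vector to (gamma, beta)."""
    params = weights @ condition + bias
    gamma, beta = np.split(params, 2)
    return gamma, beta

def film(features, gamma, beta):
    """Feature-wise linear modulation: per-channel scale and shift."""
    # features: (channels, height, width); gamma, beta: (channels,)
    return gamma[:, None, None] * features + beta[:, None, None]

rng = np.random.default_rng(0)
num_sources, channels = 4, 8

# Assumption: the input vector is a one-hot code selecting the target source.
condition = np.eye(num_sources)[2]

# Assumption: random weights stand in for a learned embedding.
weights = rng.standard_normal((2 * channels, num_sources))
bias = np.zeros(2 * channels)

gamma, beta = embed_condition(condition, weights, bias)
features = rng.standard_normal((channels, 16, 16))
modulated = film(features, gamma, beta)
print(modulated.shape)
```

Changing only the condition vector changes gamma and beta, so the same feature maps are modulated differently for each requested source.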

Sudo rm -rf: Efficient Networks for Universal Audio Source Separation

etzinis/sudo_rm_rf 14 Jul 2020

In this paper, we present an efficient neural network for end-to-end general purpose audio source separation.

Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures

karnwatcharasupat/directional-sparse-filtering-tf 30 Jan 2021

In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix.

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks

darius522/dnr-utils 19 Oct 2021

The cocktail party problem aims at isolating any source of interest within a complex acoustic scene, and has long inspired audio source separation research.