Audio Source Separation
39 papers with code • 2 benchmarks • 12 datasets
Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
Models for audio source separation usually operate on the magnitude spectrum, which ignores phase information and makes separation performance dependent on the hyper-parameters of the spectral front-end.
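To make the magnitude-only pipeline concrete, here is a minimal sketch using only the Python standard library. It assumes a single analysis frame, a naive DFT as the spectral front-end, and a hand-written binary mask standing in for what a trained model would predict. The function names (`dft`, `idft`, `separate_frame`) and the toy signals are illustrative and do not come from any of the listed papers.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (O(n^2), for illustration)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def separate_frame(mixture_frame, mask):
    """Scale the mixture magnitudes by a mask and reuse the mixture phase unchanged."""
    spectrum = dft(mixture_frame)
    magnitudes = [abs(c) for c in spectrum]
    phases = [cmath.phase(c) for c in spectrum]  # never modelled, only copied
    masked = [m * g for m, g in zip(magnitudes, mask)]
    return idft([cmath.rect(m, p) for m, p in zip(masked, phases)])


# Toy mixture: two sinusoids that fall exactly on bins 2 and 5 of a 16-sample frame.
N = 16  # frame length, a front-end hyper-parameter
low = [math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
high = [0.5 * math.sin(2 * math.pi * 5 * t / N) for t in range(N)]
mixture = [a + b for a, b in zip(low, high)]

# Hypothetical oracle mask: keep bin 2 and its mirror bin N - 2, drop everything else.
mask = [1.0 if k in (2, N - 2) else 0.0 for k in range(N)]
estimate = separate_frame(mixture, mask)
```

In this toy case the sources do not overlap in frequency, so copying the mixture phase loses nothing. With real recordings the sources share bins, and both the reused phase and the choice of frame length limit how good the estimate can be.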
Based on this idea, we drive the separator towards outputs deemed realistic by discriminator networks that are trained to tell real samples apart from separator outputs.
Our work is the first to learn audio source separation from large-scale "in the wild" videos containing multiple audio sources per video.
The input vector is embedded to obtain the parameters that control Feature-wise Linear Modulation (FiLM) layers.
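As a rough illustration of the conditioning mechanism, the sketch below embeds a condition vector into per-channel scale and shift parameters and applies them as a FiLM operation, `gamma * x + beta`. It assumes a single linear embedding per parameter and plain nested lists as feature maps. The weights, the one-hot condition, and the function names are made up for the example and are not taken from the paper.

```python
def linear(vector, weights, bias):
    """Dense layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]


def film_parameters(condition, w_gamma, b_gamma, w_beta, b_beta):
    """Embed the condition vector into per-channel scale (gamma) and shift (beta)."""
    gamma = linear(condition, w_gamma, b_gamma)
    beta = linear(condition, w_beta, b_beta)
    return gamma, beta


def film(features, gamma, beta):
    """Apply gamma * x + beta to every value of the matching channel."""
    return [
        [g * x + b for x in channel]
        for channel, g, b in zip(features, gamma, beta)
    ]


# Hypothetical one-hot condition selecting the first of two target sources.
condition = [1.0, 0.0]

# Hand-picked embedding weights (a trained network would learn these).
w_gamma = [[2.0, 0.0], [0.0, 1.0]]
b_gamma = [0.0, 0.0]
w_beta = [[0.5, 0.0], [0.0, -1.0]]
b_beta = [0.0, 0.0]

features = [[1.0, 2.0], [3.0, 4.0]]  # two channels, two values each
gamma, beta = film_parameters(condition, w_gamma, b_gamma, w_beta, b_beta)
modulated = film(features, gamma, beta)
```

With this condition the first channel is amplified and shifted while the second is zeroed out, which shows how one shared network can be steered towards different outputs by changing only the input vector.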
Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures
In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix.
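For readers unfamiliar with the mean named in the title, the sketch below implements the standard weighted Lehmer mean, sum(w * x^p) / sum(w * x^(p-1)), for positive values. How the paper uses it inside directional sparse filtering is not described here, so the example only shows the definition; the function name and sample values are illustrative.

```python
def weighted_lehmer_mean(values, p, weights=None):
    """Weighted Lehmer mean of positive values with exponent p."""
    if weights is None:
        weights = [1.0] * len(values)  # unweighted case
    numerator = sum(w * x ** p for w, x in zip(weights, values))
    denominator = sum(w * x ** (p - 1) for w, x in zip(weights, values))
    return numerator / denominator


magnitudes = [1.0, 2.0, 4.0]
harmonic = weighted_lehmer_mean(magnitudes, p=0)    # p = 0 gives the harmonic mean
arithmetic = weighted_lehmer_mean(magnitudes, p=1)  # p = 1 gives the arithmetic mean
near_max = weighted_lehmer_mean(magnitudes, p=20)   # large p approaches the maximum
only_last = weighted_lehmer_mean(magnitudes, p=1, weights=[0.0, 0.0, 1.0])
```

The exponent `p` moves the result between the smallest and largest value, and the weights let some entries count more than others.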
The cocktail party problem, the task of isolating any source of interest within a complex acoustic scene, has long inspired audio source separation research.