Audio Source Separation
44 papers with code • 2 benchmarks • 14 datasets
Audio Source Separation is the process of separating a mixture (e.g. a pop band recording) into isolated sounds from individual sources (e.g. just the lead vocals).
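Many separation systems work in the time-frequency domain: a model predicts a mask over the mixture's spectrogram, and the masked spectrogram is inverted back to a waveform. The following is a minimal illustrative sketch of that idea (not any listed paper's method), using an oracle "ideal ratio mask" on two synthetic tones in place of a trained model's prediction:

```python
# Sketch of mask-based source separation: mix two synthetic tones,
# build an ideal ratio mask in the time-frequency domain, apply it to
# the mixture's STFT, and invert back to a waveform.
import numpy as np
from scipy.signal import stft, istft

fs = 8000
t = np.arange(fs) / fs
src_a = np.sin(2 * np.pi * 440 * t)   # stand-in for one source (e.g. vocals)
src_b = np.sin(2 * np.pi * 1760 * t)  # stand-in for the accompaniment
mix = src_a + src_b

# Time-frequency representations (complex STFTs)
_, _, Z_mix = stft(mix, fs=fs, nperseg=512)
_, _, Z_a = stft(src_a, fs=fs, nperseg=512)
_, _, Z_b = stft(src_b, fs=fs, nperseg=512)

# Ideal ratio mask: fraction of each bin's magnitude belonging to source A.
# A trained separator would predict this mask instead of using the oracle.
mask_a = np.abs(Z_a) / (np.abs(Z_a) + np.abs(Z_b) + 1e-8)

# Apply the mask to the mixture and invert back to a time-domain signal
_, est_a = istft(Z_mix * mask_a, fs=fs, nperseg=512)
est_a = est_a[: len(src_a)]

corr = np.corrcoef(est_a, src_a)[0, 1]
print(corr > 0.95)  # the two tones are spectrally disjoint, so recovery is near-perfect
```

Because the two tones occupy different frequency bins, the mask separates them almost exactly; real mixtures overlap in time-frequency, which is what learned models must handle.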
Source: Model selection for deep audio source separation via clustering analysis
Latest papers with no code
Gull: A Generative Multifunctional Audio Codec
We introduce Gull, a generative multifunctional audio codec.
Mixture of Dynamical Variational Autoencoders for Multi-Source Trajectory Modeling and Separation
In this paper, we propose a latent-variable generative model called mixture of dynamical variational autoencoders (MixDVAE) to model the dynamics of a system composed of multiple moving sources.
GASS: Generalizing Audio Source Separation with Large-scale Data
Here, we study a single general audio source separation (GASS) model trained to separate speech, music, and sound events in a supervised fashion with a large-scale dataset.
Language-Guided Audio-Visual Source Separation via Trimodal Consistency
We propose a self-supervised approach for learning to perform audio source separation in videos based on natural language queries, using only unlabeled video and audio pairs as training data.
Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation
Applying a diffusion-model vocoder, pretrained to model single-speaker voices, to the output of a deterministic separation model leads to state-of-the-art separation results.

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks
In this paper, we focus on the cocktail fork problem, which takes a three-pronged approach to source separation: an audio mixture such as a movie soundtrack or podcast is separated into the three broad categories of speech, music, and sound effects (SFX, understood to include ambient noise and natural sound events).
Hyperbolic Audio Source Separation
We introduce a framework for audio source separation using embeddings on a hyperbolic manifold that compactly represent the hierarchical relationship between sound sources and time-frequency features.
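The hyperbolic model typically used for such hierarchical embeddings is the Poincaré ball, where distances grow rapidly near the boundary, so coarse categories can sit near the origin and fine-grained features near the edge. As a hedged sketch of the underlying geometry (not the paper's implementation), the geodesic distance on the Poincaré ball is:

```python
# Geodesic distance on the Poincare ball (unit-ball model of hyperbolic
# space), commonly used to embed hierarchies with little distortion.
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Distance between two points strictly inside the unit ball."""
    sq_dist = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq_dist / (denom + eps))

root = np.array([0.0, 0.0])  # a coarse node (e.g. a source class) near the origin
leaf = np.array([0.0, 0.9])  # a fine-grained node pushed toward the boundary

# Hyperbolic distance exceeds the Euclidean distance between the same points,
# and blows up as points approach the boundary of the ball.
print(poincare_distance(root, leaf) > np.linalg.norm(root - leaf))  # True
```

This boundary behavior is what lets hyperbolic embeddings pack exponentially many leaf nodes at roughly equal distance from a shared parent, mirroring a tree-like hierarchy of sources and their time-frequency features.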
Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-Linear Modelling for Audio Source Separation
This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS).
Learning Audio-Visual Dynamics Using Scene Graphs for Audio Source Separation
In this paper, we propose to use this connection between audio and visual dynamics for solving two challenging tasks simultaneously, namely: (i) separating audio sources from a mixture using visual cues, and (ii) predicting the 3D visual motion of a sounding source using its separated audio.
Hierarchic Temporal Convolutional Network With Cross-Domain Encoder for Music Source Separation
In this paper, we propose a model that combines complex-spectrogram-domain and time-domain features through a cross-domain encoder (CDE) and adopts a hierarchic temporal convolutional network (HTCN) for separating multiple music sources.