Speech Separation

94 papers with code • 18 benchmarks • 16 datasets

Speech Separation is the task of extracting all overlapping speech sources from a given mixed speech signal. It is a special case of the source separation problem, in which the focus is solely on the overlapping speech sources, while other interferences such as music or noise signals are not the main concern.

Source: A Unified Framework for Speech Separation

Image credit: Speech Separation of A Target Speaker Based on Deep Neural Networks
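
Below is a minimal, self-contained sketch of the problem setup (not taken from any of the papers listed here): two placeholder signals are mixed into a single observation, and the scale-invariant SDR (SI-SDR), a metric commonly reported for this task, scores an estimate against a reference source.

```python
import numpy as np

rng = np.random.default_rng(0)
sample_rate = 8000
n_samples = 4 * sample_rate

# Stand-ins for two clean speech sources (random signals for illustration).
s1 = rng.standard_normal(n_samples)
s2 = rng.standard_normal(n_samples)

# The observed single-channel signal is the sum of the overlapping sources.
mixture = s1 + s2

def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB: higher means a better source estimate."""
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    noise = estimate - target
    return 10 * np.log10(np.sum(target ** 2) / np.sum(noise ** 2))

# A separation model should map `mixture` back to estimates of s1 and s2.
print("mixture as estimate of s1:", si_sdr(mixture, s1))       # around 0 dB
print("partially separated     :", si_sdr(s1 + 0.1 * s2, s1))  # roughly 20 dB
```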

Libraries

Use these libraries to find Speech Separation models and implementations
See all 6 libraries.

Most implemented papers

Compute and memory efficient universal sound source separation

etzinis/sudo_rm_rf 3 Mar 2021

Recent progress in audio source separation led by deep learning has enabled many neural network models to provide robust solutions to this fundamental estimation problem.

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

bill9800/speech_separation 13 Feb 2015

In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.
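
As a rough illustration of the masking step this paper builds on, the sketch below applies an oracle "ideal ratio mask" to a mixture spectrogram computed with a simple framed FFT; in the paper the masks are predicted by a recurrent network and optimized jointly with it, which this toy example does not do.

```python
import numpy as np

def framed_fft(x, n_fft=512, hop=128):
    """Very simple STFT substitute: windowed frames followed by an rFFT."""
    frames = [x[i:i + n_fft] for i in range(0, len(x) - n_fft, hop)]
    return np.fft.rfft(np.stack(frames) * np.hanning(n_fft), axis=1)

rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal(16000), rng.standard_normal(16000)  # placeholder sources
mix = s1 + s2

S1, S2, X = framed_fft(s1), framed_fft(s2), framed_fft(mix)

# Soft masks constrained to sum to one across sources, here computed from the
# known sources (an "ideal ratio mask"); a model would predict them instead.
m1 = np.abs(S1) / (np.abs(S1) + np.abs(S2) + 1e-8)
m2 = 1.0 - m1

# Separated spectrogram estimates: mask times the mixture representation.
est1, est2 = m1 * X, m2 * X
print(est1.shape, est2.shape)
```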

Single-Channel Multi-Speaker Separation using Deep Clustering

JusperLee/Deep-Clustering-for-Speech-Separation 7 Jul 2016

In this paper we extend the baseline system with an end-to-end signal approximation objective that greatly improves performance on a challenging speech separation task.
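
The sketch below illustrates the deep clustering inference step in general terms: each time-frequency bin is mapped to an embedding (random placeholders here stand in for the network's output), and clustering the embeddings yields binary separation masks. The end-to-end signal approximation objective added by this paper is not shown.

```python
import numpy as np

T, F, D, K = 100, 129, 20, 2   # frames, frequency bins, embedding dim, speakers
rng = np.random.default_rng(0)

# Placeholder for the network output: one unit-norm embedding per T-F bin.
embeddings = rng.standard_normal((T * F, D))
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)

# Plain k-means over the embeddings (a handful of Lloyd iterations).
centroids = embeddings[rng.choice(len(embeddings), K, replace=False)]
for _ in range(10):
    dists = ((embeddings[:, None, :] - centroids[None]) ** 2).sum(axis=-1)
    assign = dists.argmin(axis=1)
    centroids = np.stack([embeddings[assign == k].mean(axis=0) for k in range(K)])

# Binary masks, one per speaker, to be applied to the mixture spectrogram.
masks = [(assign == k).reshape(T, F) for k in range(K)]
print(masks[0].shape, masks[0].mean())
```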

Two-Step Sound Source Separation: Training on Learned Latent Targets

etzinis/two_step_mask_learning 22 Oct 2019

In the first step we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal.
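
A hedged sketch of that first-step idea, with arbitrary filter sizes: a learned analysis transform and its (approximate) inverse are implemented as a 1-D convolution and transposed convolution, and an oracle ratio mask is applied in the resulting latent space. The paper's actual two-step training procedure and architecture are not reproduced here.

```python
import torch
import torch.nn as nn

# Learned analysis transform and its (approximate) inverse, with made-up sizes.
encoder = nn.Conv1d(1, 256, kernel_size=21, stride=10, bias=False)
decoder = nn.ConvTranspose1d(256, 1, kernel_size=21, stride=10, bias=False)

s1, s2 = torch.randn(1, 1, 8000), torch.randn(1, 1, 8000)  # placeholder sources
mix = s1 + s2

with torch.no_grad():
    Z1, Z2, Zmix = encoder(s1), encoder(s2), encoder(mix)
    # Oracle ratio mask computed in the learned latent space.
    mask1 = Z1.abs() / (Z1.abs() + Z2.abs() + 1e-8)
    est1 = decoder(mask1 * Zmix)      # back to the waveform domain
print(est1.shape)
```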

Filterbank design for end-to-end speech separation

mpariente/AsSteroid 23 Oct 2019

Also, we validate the use of parameterized filterbanks and show that complex-valued representations and masks are beneficial in all conditions.
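
To make the filterbank comparison concrete, the sketch below builds two analysis frontends of the same size: a fixed STFT-like filterbank (windowed cosine/sine filters) and a free, fully learned filterbank. Filter counts and lengths are arbitrary assumptions, and this is not the Asteroid library's API.

```python
import numpy as np
import torch
import torch.nn as nn

n_filters, kernel_size, stride = 512, 256, 128

# Fixed STFT-like filters: windowed cosine and sine (real/imaginary) basis.
n = np.arange(kernel_size)
freqs = np.arange(n_filters // 2)[:, None]
window = np.hanning(kernel_size)
cos_part = np.cos(2 * np.pi * freqs * n / kernel_size) * window
sin_part = np.sin(2 * np.pi * freqs * n / kernel_size) * window
fixed = torch.tensor(np.concatenate([cos_part, sin_part]), dtype=torch.float32)

stft_fb = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)
stft_fb.weight.data = fixed.unsqueeze(1)
stft_fb.weight.requires_grad = False       # analytic, non-trainable filterbank

free_fb = nn.Conv1d(1, n_filters, kernel_size, stride=stride, bias=False)  # learned

wav = torch.randn(1, 1, 16000)
print(stft_fb(wav).shape, free_fb(wav).shape)   # identical feature shapes
```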

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

yluo42/TAC 30 Oct 2019

An important problem in ad-hoc microphone speech separation is how to guarantee the robustness of a system with respect to the locations and numbers of microphones.
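
The sketch below shows a transform-average-concatenate (TAC) style block in the spirit of this paper: each channel is transformed, the results are averaged across channels, and the channel-invariant average is concatenated back to every channel, so the block accepts any number and ordering of microphones. Layer sizes and details are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TACBlock(nn.Module):
    """Transform-average-concatenate style block for ad-hoc microphone arrays."""

    def __init__(self, dim, hidden):
        super().__init__()
        self.transform = nn.Sequential(nn.Linear(dim, hidden), nn.PReLU())
        self.average = nn.Sequential(nn.Linear(hidden, hidden), nn.PReLU())
        self.concat = nn.Sequential(nn.Linear(2 * hidden, dim), nn.PReLU())

    def forward(self, x):                      # x: (batch, n_mics, time, dim)
        h = self.transform(x)                  # per-channel transform
        avg = self.average(h.mean(dim=1, keepdim=True))  # channel-invariant stats
        avg = avg.expand_as(h)                 # broadcast back to every channel
        return x + self.concat(torch.cat([h, avg], dim=-1))  # residual output

block = TACBlock(dim=64, hidden=128)
print(block(torch.randn(2, 4, 50, 64)).shape)  # 4-microphone input
print(block(torch.randn(2, 6, 50, 64)).shape)  # same block, 6 microphones
```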

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation

yuhogun0908/MISOnet 4 Oct 2020

Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.

Directional Sparse Filtering using Weighted Lehmer Mean for Blind Separation of Unbalanced Speech Mixtures

karnwatcharasupat/directional-sparse-filtering-tf 30 Jan 2021

In blind source separation of speech signals, the inherent imbalance in the source spectrum poses a challenge for methods that rely on single-source dominance for the estimation of the mixing matrix.
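
For reference, the weighted Lehmer mean itself is a simple generalized mean; the sketch below implements it and shows that p = 1 recovers the ordinary weighted mean while larger p emphasizes dominant values. How it enters the paper's directional sparse filtering objective is not reproduced here.

```python
import numpy as np

def weighted_lehmer_mean(x, w, p):
    """Weighted Lehmer mean: sum(w * x**p) / sum(w * x**(p - 1))."""
    x, w = np.asarray(x, dtype=float), np.asarray(w, dtype=float)
    return np.sum(w * x ** p) / np.sum(w * x ** (p - 1))

values = np.array([0.1, 0.2, 3.0])
weights = np.ones_like(values)

print(weighted_lehmer_mean(values, weights, p=1.0))  # ordinary (weighted) mean
print(weighted_lehmer_mean(values, weights, p=2.0))  # pulled toward the largest value
```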

Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation

Zhongyang-debug/Sandglasset-A-Light-Multi-Granularity-Self-Attentive-Network-For-Time-Domain-Speech-Separation 1 Mar 2021

One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers.
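
A sketch of the fixed-size segmentation used by such dual-path models: a long sequence of frame features is folded into equal-length segments so that one path can model within-segment structure and another across-segment structure. Sandglasset's contribution of varying the granularity across layers is not shown; segment and feature sizes here are arbitrary.

```python
import torch
import torch.nn.functional as F

def segment(x, segment_size):
    """Fold (batch, time, dim) features into (batch, n_segments, segment_size, dim)."""
    batch, time, dim = x.shape
    pad = (-time) % segment_size               # zero-pad so time divides evenly
    x = F.pad(x, (0, 0, 0, pad))
    return x.view(batch, -1, segment_size, dim)

features = torch.randn(2, 1000, 64)            # encoded mixture frames
segments = segment(features, segment_size=128)
print(segments.shape)                          # torch.Size([2, 8, 128, 64])

# A dual-path model alternates a sequence model over dim 2 (within segments)
# with one over dim 1 (across segments); Sandglasset varies the segment
# granularity from layer to layer instead of keeping it fixed.
```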