Speech Separation

18 papers with code · Speech

Greatest papers with code

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

20 Sep 2018 · facebookresearch/demucs

The majority of the previous methods have formulated the separation problem through the time-frequency representation of the mixed signal, which has several drawbacks, including the decoupling of the phase and magnitude of the signal, the suboptimality of time-frequency representation for speech separation, and the long latency in calculating the spectrograms.

SPEAKER SEPARATION · SPEECH SEPARATION

Deep learning for monaural speech separation

ICASSP 2014 · posenhuang/deeplearningsourceseparation

We propose the joint optimization of the deep learning models (deep neural networks and recurrent neural networks) with an extra masking layer, which enforces a reconstruction constraint.

MULTI-SPEAKER SOURCE SEPARATION · SPEECH SEPARATION
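
The "extra masking layer" mentioned in the abstract above can be illustrated with a small sketch. It assumes a two-source soft mask computed from the network's magnitude estimates and applied to the mixture, which is one common form of such a layer; the function name `soft_mask_layer`, the flat list-of-bins input, and the `eps` stabilizer are illustrative assumptions, not the authors' code.

```python
def soft_mask_layer(est1, est2, mixture, eps=1e-8):
    """Apply a soft mask built from two source estimates to the mixture.

    est1, est2: raw network magnitude estimates, one float per
                time-frequency bin (hypothetical flat layout).
    mixture:    mixture magnitudes for the same bins.
    Returns two masked outputs that sum to the mixture in every bin.
    """
    out1, out2 = [], []
    for a, b, x in zip(est1, est2, mixture):
        a, b = abs(a), abs(b)
        m1 = a / (a + b + eps)        # share of the bin given to source 1
        out1.append(m1 * x)
        out2.append((1.0 - m1) * x)   # remainder goes to source 2
    return out1, out2
```

Because the two masks sum to one in every bin, the two outputs always add back up to the mixture, which is the reconstruction constraint the layer enforces.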

Multi-talker Speech Separation with Utterance-level Permutation Invariant Training of Deep Recurrent Neural Networks

18 Mar 2017 · snsun/pit-speech-separation

We evaluated uPIT on the WSJ0 and Danish two- and three-talker mixed-speech separation tasks and found that uPIT outperforms techniques based on Non-negative Matrix Factorization (NMF) and Computational Auditory Scene Analysis (CASA), and compares favorably with Deep Clustering (DPCL) and the Deep Attractor Network (DANet).

SPEECH SEPARATION
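
As a rough illustration of utterance-level permutation invariant training (uPIT), the sketch below scores every assignment of network outputs to reference speakers over the whole utterance and keeps the cheapest one. The mean-squared-error criterion, the nested-list layout `[speaker][frame][bin]`, and the names `mse` and `upit_loss` are assumptions made for this example rather than details taken from the paper or the linked repository.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two utterances given as [frame][bin]."""
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for x, y in zip(frame_a, frame_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def upit_loss(estimates, targets):
    """Return (loss, permutation) for the best output-to-speaker assignment.

    One permutation is chosen for the entire utterance, so each output
    stream stays tied to a single speaker across all frames.
    """
    n = len(targets)
    best_loss, best_perm = None, None
    for perm in permutations(range(n)):
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n
        if best_loss is None or loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm
```

The search visits all orderings of the speakers, which stays cheap for the two- and three-talker cases mentioned above.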

Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation

10 Apr 2018 · bill9800/speech_separation

Solving this task using only audio as input is extremely challenging and does not provide an association of the separated speech signals with speakers in the video.

SPEECH SEPARATION

Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

13 Feb 2015 · bill9800/speech_separation

In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising.

DENOISING · SPEECH SEPARATION

Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation

14 Oct 2019 · JusperLee/Dual-Path-RNN-Pytorch

Recent studies in deep learning-based speech separation have proven the superiority of time-domain approaches to conventional time-frequency-based methods.

SPEECH SEPARATION

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments

6 Nov 2018 · dr-pato/audio_visual_speech_enhancement

In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available.

SPEECH ENHANCEMENT · SPEECH SEPARATION

Deep Recurrent NMF for Speech Separation by Unfolding Iterative Thresholding

21 Sep 2017 · stwisdom/dr-nmf

This interpretability also provides principled initializations that enable faster training and convergence to better solutions compared to conventional random initialization.

SPEECH SEPARATION

End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation

30 Oct 2019 · yluo42/TAC

An important problem in ad-hoc microphone speech separation is how to guarantee the robustness of a system with respect to the locations and numbers of microphones.

SPEECH SEPARATION

Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation

25 Apr 2019 · yuzhou-git/deep-casa

Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network.

SPEAKER SEPARATION · SPEECH SEPARATION
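
The per-frame grouping described in the abstract above can be sketched as a frame-level permutation search: in each time frame, independently pick the assignment of outputs to speakers with the smallest error. The squared-error criterion, the `[speaker][frame][bin]` layout, and the name `frame_level_assignment` are assumptions for illustration only and are not taken from the paper's code.

```python
from itertools import permutations


def frame_level_assignment(estimates, targets):
    """Pick the best output-to-speaker permutation separately per frame.

    estimates, targets: nested lists indexed as [speaker][frame][bin].
    Returns a list with one permutation tuple per time frame.
    """
    n = len(targets)
    num_frames = len(targets[0])
    assignments = []
    for t in range(num_frames):

        def frame_error(perm):
            # Squared error in frame t if output i is matched to speaker perm[i].
            return sum(
                (e - r) ** 2
                for i, p in enumerate(perm)
                for e, r in zip(estimates[i][t], targets[p][t])
            )

        assignments.append(min(permutations(range(n)), key=frame_error))
    return assignments
```

Since each frame is matched on its own, an output stream may switch speakers from one frame to the next, so the frame-wise results still have to be organized across time in a later step.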