Speaker Separation
10 papers with code • 0 benchmarks • 3 datasets
Most implemented papers
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Most previous methods formulate the separation problem in the time-frequency representation of the mixed signal, which has several drawbacks: it decouples the phase and magnitude of the signal, the time-frequency representation is suboptimal for speech separation, and calculating spectrograms incurs long latency.
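The core idea is to replace the STFT with a learned time-domain encoder and to apply masks in that learned space. Below is a minimal, hypothetical PyTorch sketch of that encoder-mask-decoder pattern; the layer sizes and the simple convolutional separator are illustrative stand-ins for the paper's dilated temporal convolutional network, not the actual architecture.

```python
# A minimal sketch of time-domain masking in the spirit of Conv-TasNet.
# All sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class TinyTasNet(nn.Module):
    def __init__(self, n_src=2, n_filters=64, kernel=16, stride=8):
        super().__init__()
        self.n_src = n_src
        # Learned 1-D conv encoder replaces the STFT.
        self.encoder = nn.Conv1d(1, n_filters, kernel, stride=stride, bias=False)
        # Stand-in for the paper's dilated temporal convolutional network.
        self.separator = nn.Sequential(
            nn.Conv1d(n_filters, n_filters, 3, padding=1), nn.ReLU(),
            nn.Conv1d(n_filters, n_src * n_filters, 1),
        )
        self.decoder = nn.ConvTranspose1d(n_filters, 1, kernel, stride=stride, bias=False)

    def forward(self, mix):                                  # mix: (batch, samples)
        feats = torch.relu(self.encoder(mix.unsqueeze(1)))   # (B, F, T)
        masks = torch.sigmoid(self.separator(feats))
        masks = masks.view(mix.size(0), self.n_src, feats.size(1), -1)
        # Apply one mask per source, then decode each back to a waveform.
        srcs = [self.decoder(feats * masks[:, i]) for i in range(self.n_src)]
        return torch.cat(srcs, dim=1)                        # (B, n_src, samples)

est = TinyTasNet()(torch.randn(4, 8000))   # four half-second 16 kHz mixtures
print(est.shape)                           # torch.Size([4, 2, 8000])
```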
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals by making use of a reference signal from the target speaker.
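As a rough illustration of speaker-conditioned masking, the sketch below tiles an assumed d-vector speaker embedding across time, concatenates it with the mixture magnitude spectrogram, and predicts a soft mask for the target speaker only. The tiny MLP and all dimensions are assumptions, not the paper's CNN+LSTM model.

```python
# A hedged sketch of speaker-conditioned spectrogram masking.
import torch
import torch.nn as nn

n_freq, emb_dim = 257, 256

mask_net = nn.Sequential(
    nn.Linear(n_freq + emb_dim, 512), nn.ReLU(),
    nn.Linear(512, n_freq), nn.Sigmoid(),
)

mix_mag = torch.rand(1, 100, n_freq)   # (batch, frames, freq) mixture magnitudes
dvec = torch.randn(1, emb_dim)         # speaker embedding from a reference utterance

# Tile the embedding over time so every frame is conditioned on the speaker.
cond = torch.cat([mix_mag, dvec.unsqueeze(1).expand(-1, mix_mag.size(1), -1)], dim=-1)
mask = mask_net(cond)                  # soft mask in [0, 1]
target_mag = mask * mix_mag            # estimated target-speaker magnitudes
```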
Single-Channel Multi-Speaker Separation using Deep Clustering
In this paper we extend the baseline system with an end-to-end signal approximation objective that greatly improves performance on a challenging speech separation task.
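The underlying deep clustering objective can be written compactly: the network emits a unit-norm embedding per time-frequency bin, and the loss matches the embedding affinity matrix VV^T to the label affinity YY^T, so bins dominated by the same speaker cluster together. A minimal sketch under assumed shapes:

```python
# The deep clustering loss ||VV^T - YY^T||_F^2, expanded so the large
# (TF x TF) affinity matrices are never formed explicitly.
import torch

def deep_clustering_loss(V, Y):
    """V: (batch, T*F, D) embeddings, Y: (batch, T*F, C) one-hot labels."""
    vtv = torch.bmm(V.transpose(1, 2), V)   # (B, D, D)
    vty = torch.bmm(V.transpose(1, 2), Y)   # (B, D, C)
    yty = torch.bmm(Y.transpose(1, 2), Y)   # (B, C, C)
    return (vtv.pow(2).sum() - 2 * vty.pow(2).sum() + yty.pow(2).sum()) / V.size(0)

V = torch.nn.functional.normalize(torch.randn(2, 500, 20), dim=-1)
Y = torch.nn.functional.one_hot(torch.randint(2, (2, 500)), 2).float()
print(deep_clustering_loss(V, Y))   # at inference, k-means on V recovers the sources
```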
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
Although our system is trained on simulated room impulse responses (RIRs) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
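A hedged sketch of the complex spectral mapping idea: stack the real and imaginary STFT components of all microphones as input channels and regress the real and imaginary STFT of each source directly, so phase is estimated as well rather than masked. The conv stack and sizes below are assumptions, not the paper's network.

```python
# Multi-channel complex spectral mapping, reduced to its input/output contract.
import torch
import torch.nn as nn

n_mics, n_src = 6, 2
net = nn.Sequential(
    nn.Conv2d(2 * n_mics, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 2 * n_src, 3, padding=1),   # real+imag per source
)

mix = torch.randn(1, n_mics, 257, 100, dtype=torch.complex64)  # (B, mics, freq, frames)
x = torch.cat([mix.real, mix.imag], dim=1)                     # (B, 2*mics, F, T)
out = net(x)
est = torch.complex(out[:, :n_src], out[:, n_src:])            # (B, n_src, F, T) complex STFTs
```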
Monaural Audio Speaker Separation with Source Contrastive Estimation
Although the matrix determined by the output weights depends on a set of known speakers, we only use the input vectors during inference.
Neural separation of observed and unobserved distributions
In this work, we introduce a new method, Neural Egg Separation, to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution.
Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network.
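Utterance- or frame-level permutation invariant training (PIT), which several papers in this list rely on, evaluates the loss under every speaker-to-output assignment and backpropagates through the cheapest one, since the ordering of the outputs is arbitrary. A minimal sketch with an assumed MSE criterion:

```python
# Permutation invariant training loss: take the minimum over all
# speaker-to-output assignments.
from itertools import permutations
import torch

def pit_mse(est, ref):
    """est, ref: (batch, n_src, samples). Returns min-permutation MSE."""
    n_src = est.size(1)
    losses = []
    for perm in permutations(range(n_src)):
        # MSE with the references reordered by this permutation.
        losses.append(((est - ref[:, list(perm)]) ** 2).mean(dim=(1, 2)))
    return torch.stack(losses, dim=1).min(dim=1).values.mean()

est, ref = torch.randn(4, 2, 8000), torch.randn(4, 2, 8000)
print(pit_mse(est, ref))
```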
Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss
We have open-sourced our re-implementation of DPRNN-TasNet (https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation). Our TasTas is built on this implementation of DPRNN-TasNet, and we believe the results in this paper can be reproduced with ease.
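The dual-path idea that DPRNN-TasNet (and hence TasTas) builds on can be sketched in a few lines: fold a long feature sequence into short chunks, run an intra-chunk BiLSTM over each chunk, then an inter-chunk BiLSTM across chunks at each within-chunk position. One illustrative block, with assumed sizes:

```python
# A single dual-path block, reduced to the two-pass folding pattern.
import torch
import torch.nn as nn

feat, chunk = 64, 100
intra = nn.LSTM(feat, feat // 2, bidirectional=True, batch_first=True)
inter = nn.LSTM(feat, feat // 2, bidirectional=True, batch_first=True)

x = torch.randn(1, 3000, feat)                   # (batch, long sequence, features)
b, t, f = x.shape
x = x.view(b, t // chunk, chunk, f)              # fold into (B, n_chunks, chunk, F)

# Intra-chunk pass: treat every chunk as an independent short sequence.
y = intra(x.reshape(-1, chunk, f))[0].reshape(b, -1, chunk, f) + x
# Inter-chunk pass: a sequence over chunks at each within-chunk position.
z = inter(y.transpose(1, 2).reshape(-1, t // chunk, f))[0]
z = z.reshape(b, chunk, t // chunk, f).transpose(1, 2) + y   # (B, n_chunks, chunk, F)
```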
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Speech separation is well developed thanks to the very successful permutation invariant training (PIT) approach, but the frequent label-assignment switching that occurs during PIT training remains a problem when better convergence speed and achievable performance are desired.
Blind Speech Separation and Dereverberation using Neural Beamforming
In this paper, we present the Blind Speech Separation and Dereverberation (BSSD) network, which performs simultaneous speaker separation, dereverberation and speaker identification in a single neural network.
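Mask-based MVDR beamforming is one common form of the neural beamforming such systems build on; the sketch below (plain numpy, not the BSSD network itself) estimates speech and noise covariances from network-predicted T-F masks and derives the MVDR filter per frequency using the trace formulation.

```python
# Mask-based MVDR beamforming: covariances from masks, filter per frequency.
import numpy as np

def mvdr_weights(X, speech_mask, noise_mask, ref_mic=0):
    """X: (mics, freq, frames) complex STFT; masks: (freq, frames) in [0, 1]."""
    M, F, T = X.shape
    w = np.zeros((F, M), dtype=complex)
    for f in range(F):
        Xf = X[:, f]                                     # (mics, frames)
        phi_s = (speech_mask[f] * Xf) @ Xf.conj().T / T  # speech covariance
        phi_n = (noise_mask[f] * Xf) @ Xf.conj().T / T   # noise covariance
        phi_n += 1e-6 * np.eye(M)                        # diagonal loading
        num = np.linalg.solve(phi_n, phi_s)              # Phi_n^{-1} Phi_s
        w[f] = num[:, ref_mic] / (np.trace(num) + 1e-9)  # MVDR via trace form
    return w

X = np.random.randn(6, 257, 100) + 1j * np.random.randn(6, 257, 100)
m = np.random.rand(257, 100)                             # stand-in speech mask
w = mvdr_weights(X, m, 1 - m)
enhanced = np.einsum('fm,mft->ft', w.conj(), X)          # beamformed STFT
```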