Speaker Separation
11 papers with code • 0 benchmarks • 3 datasets
Most implemented papers
Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation
Most previous methods formulate the separation problem in the time-frequency representation of the mixed signal, which has several drawbacks: the decoupling of the phase and magnitude of the signal, the suboptimality of the time-frequency representation for speech separation, and the long latency of computing spectrograms.
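Conv-TasNet instead works directly on the waveform: a learned 1-D convolutional encoder maps short windows to a latent space, a separation network estimates one mask per source there, and a learned decoder maps the masked latents back to audio. A minimal NumPy sketch of that encode/mask/decode pipeline (random matrices stand in for the learned bases, and `toy_masks` stands in for the paper's TCN separation module; both are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
L, N = 16, 32  # window length, number of basis filters

# stand-ins for the learned encoder/decoder bases
encoder = rng.standard_normal((L, N))
decoder = rng.standard_normal((N, L))

def separate(mixture, mask_fn, n_sources=2):
    """Time-domain masking: encode non-overlapping windows with a learned
    basis, apply one mask per source in the latent space, decode back."""
    frames = mixture.reshape(-1, L) @ encoder       # (T, N) latent frames
    masks = mask_fn(frames, n_sources)              # (n_sources, T, N)
    return [(m * frames) @ decoder for m in masks]  # per-source waveform frames

def toy_masks(frames, n_sources):
    # hypothetical soft masks that sum to one across sources
    m = np.abs(rng.standard_normal((n_sources,) + frames.shape))
    return m / m.sum(axis=0, keepdims=True)

mix = rng.standard_normal(8 * L)
sources = separate(mix, toy_masks)
```

Because the masks sum to one, the masked latents of all sources add back up to the encoded mixture; no spectrogram (and hence no phase reconstruction) is ever needed.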
VoiceFilter: Targeted Voice Separation by Speaker-Conditioned Spectrogram Masking
In this paper, we present a novel system that separates the voice of a target speaker from multi-speaker signals, by making use of a reference signal from the target speaker.
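The conditioning mechanism is simple: a speaker embedding (d-vector) of the target speaker is tiled across time and concatenated to every spectrogram frame, and the network predicts a soft mask for the target's energy. A minimal sketch, assuming a single linear layer `W`, `b` in place of VoiceFilter's actual CNN+LSTM mask network:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def voicefilter_step(mix_mag, d_vector, W, b):
    """Speaker-conditioned spectrogram masking (toy mask network).
    mix_mag:  (frames, freq) magnitude spectrogram of the mixture
    d_vector: (d,) embedding of the target speaker"""
    frames = mix_mag.shape[0]
    cond = np.tile(d_vector, (frames, 1))           # repeat embedding per frame
    features = np.concatenate([mix_mag, cond], 1)   # condition every frame
    mask = sigmoid(features @ W + b)                # soft mask in (0, 1)
    return mask * mix_mag                           # enhanced spectrogram
```

Swapping in a different d-vector steers the same network toward a different target speaker, which is what makes the separation "targeted".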
Single-Channel Multi-Speaker Separation using Deep Clustering
In this paper we extend the baseline system with an end-to-end signal approximation objective that greatly improves performance on a challenging speech separation task.
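The deep clustering objective underlying this line of work trains an embedding per time-frequency bin so that bins dominated by the same speaker are close. With unit embeddings V (bins × dim) and ideal one-hot assignments Y (bins × speakers), the loss is the Frobenius distance between the two affinity matrices, ||VVᵀ − YYᵀ||²_F, usually computed in the cheaper factored form:

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """Deep clustering objective ||V V^T - Y Y^T||_F^2 in factored form,
    avoiding the (bins x bins) affinity matrices.
    V: (bins, emb_dim) embeddings, Y: (bins, n_speakers) one-hot labels."""
    return (np.sum((V.T @ V) ** 2)
            - 2 * np.sum((V.T @ Y) ** 2)
            + np.sum((Y.T @ Y) ** 2))
```

At test time the embeddings are clustered (e.g. with k-means) to recover per-speaker masks; the end-to-end signal approximation objective in this paper is added on top of that pipeline.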
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation
Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.
Deep attractor network for single-microphone speaker separation
We propose a novel deep learning framework for single channel speech separation by creating attractor points in high dimensional embedding space of the acoustic signals which pull together the time-frequency bins corresponding to each source.
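Concretely, each attractor is an assignment-weighted mean of the T-F embeddings for one source, and soft masks fall out of embedding-to-attractor similarity. A small NumPy sketch of that step (the embeddings themselves would come from the trained network; the softmax masking here is one common choice, an assumption for illustration):

```python
import numpy as np

def attractor_masks(embeddings, assignments):
    """Form one attractor per source as the weighted mean of T-F embeddings,
    then derive soft masks from embedding-to-attractor similarity.
    embeddings:  (bins, emb_dim)
    assignments: (bins, n_sources) weights (ideal binary labels at training)."""
    attractors = (assignments.T @ embeddings) / (
        assignments.sum(axis=0, keepdims=True).T + 1e-8)  # (n_sources, emb_dim)
    logits = embeddings @ attractors.T                    # (bins, n_sources)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)               # softmax masks
```

Bins whose embeddings were pulled toward the same attractor end up assigned to the same source, which is the "pull together" behavior the paper describes.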
Monaural Audio Speaker Separation with Source Contrastive Estimation
Although the matrix determined by the output weights is dependent on a set of known speakers, we only use the input vectors during inference.
Neural separation of observed and unobserved distributions
In this work, we introduce a new method---Neural Egg Separation---to tackle the scenario of extracting a signal from an unobserved distribution additively mixed with a signal from an observed distribution.
Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation
Simultaneous grouping is first performed in each time frame by separating the spectra of different speakers with a permutation-invariantly trained neural network.
Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss
We have open-sourced our re-implementation of DPRNN-TasNet (https://github.com/ShiZiqiang/dual-path-RNNs-DPRNNs-based-speech-separation); our TasTas is built on this implementation, so the results in this paper should be straightforward to reproduce.
Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training
Speech separation is well developed thanks to the very successful permutation invariant training (PIT) approach, but the frequent label-assignment switching that occurs during PIT training remains a problem when faster convergence and higher achievable performance are desired.
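PIT itself, which both this paper and the deep CASA entry above build on, resolves the output-permutation ambiguity by scoring every assignment of estimated sources to references and training on the best one. A minimal sketch of the PIT loss with a brute-force search over permutations:

```python
import itertools
import numpy as np

def pit_mse(estimates, targets):
    """Permutation-invariant MSE: evaluate every assignment of estimated
    sources to reference sources and keep the lowest error.
    estimates, targets: (n_sources, n_samples) arrays."""
    n = len(targets)
    best = float("inf")
    for perm in itertools.permutations(range(n)):
        err = np.mean((estimates[list(perm)] - targets) ** 2)
        best = min(best, err)
    return best
```

Because the loss is the minimum over permutations, the winning assignment can flip between training steps; that flipping is exactly the label-assignment switching this paper targets with self-supervised pre-training.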