SepIt: Approaching a Single Channel Speech Separation Bound

24 May 2022  ·  Shahar Lutati, Eliya Nachmani, Lior Wolf ·

We present an upper bound for the Single Channel Speech Separation task, which is based on an assumption regarding the nature of short segments of speech. Using the bound, we are able to show that while recent methods have made significant progress for a few speakers, there is room for improvement for five and ten speakers. We then introduce a deep neural network, SepIt, that iteratively improves the estimation of the different speakers. At test time, SepIt runs a varying number of iterations per test sample, based on a mutual information criterion that arises from our analysis. In an extensive set of experiments, SepIt outperforms the state-of-the-art neural networks for 2, 3, 5, and 10 speakers.
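The test-time procedure described above can be sketched as a refinement loop with an early-stopping rule. The following is a minimal illustration only: `refine` and `mutual_information` are hypothetical stand-ins for the paper's actual network and mutual information estimate, which are not specified in this abstract.

```python
def separate_iteratively(mixture, refine, mutual_information, max_iters=10):
    """Iteratively refine speaker estimates, stopping when the mutual
    information criterion no longer improves (illustrative sketch)."""
    estimates = refine(mixture, None)  # initial separation pass
    for _ in range(max_iters - 1):
        new_estimates = refine(mixture, estimates)
        # Stop once a further iteration adds no information about the
        # mixture; this stand-in threshold mirrors the paper's idea of a
        # per-sample, criterion-driven number of iterations.
        if mutual_information(new_estimates, mixture) <= \
           mutual_information(estimates, mixture):
            break
        estimates = new_estimates
    return estimates
```

Because the stopping test is evaluated per sample, easy mixtures terminate early while harder ones use more iterations.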



Results from the Paper


Task               Dataset     Model  Metric   Metric Value  Global Rank
Speech Separation  Libri10Mix  SepIt  SI-SDRi  8.2           # 2
Speech Separation  Libri5Mix   SepIt  SI-SDRi  13.7          # 2
Speech Separation  WSJ0-2mix   SepIt  SI-SDRi  22.4          # 7
Speech Separation  WSJ0-3mix   SepIt  SI-SDRi  20.1          # 6
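All results are reported in SI-SDRi (scale-invariant signal-to-distortion ratio improvement, in dB): the SI-SDR of the separated estimate minus the SI-SDR of the unprocessed mixture. A minimal NumPy implementation of this standard metric, for reference:

```python
import numpy as np

def si_sdr(target, estimate):
    """Scale-Invariant SDR in dB: project the estimate onto the target,
    then compare the projected energy to the residual energy."""
    alpha = np.dot(estimate, target) / np.dot(target, target)
    s_target = alpha * target          # scaled reference
    e_residual = s_target - estimate   # everything not explained by it
    return 10 * np.log10(np.dot(s_target, s_target)
                         / np.dot(e_residual, e_residual))

def si_sdri(target, estimate, mixture):
    """SI-SDR improvement: gain over using the raw mixture as the estimate."""
    return si_sdr(target, estimate) - si_sdr(target, mixture)
```

The projection step makes the metric invariant to rescaling the estimate, so a method is not rewarded or penalized for output gain.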
