Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training

29 Oct 2020  ·  Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-Yi Lee

Speech separation is well developed thanks to the very successful permutation invariant training (PIT) approach, but the frequent label-assignment switching that occurs during PIT training remains a problem when faster convergence and better achievable performance are desired. In this paper, we propose self-supervised pre-training to stabilize the label assignment when training speech separation models. Experiments over several types of self-supervised approaches, several typical speech separation models, and two different datasets showed that substantial improvements are achievable if a proper self-supervised approach is chosen.
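To make the label-assignment problem concrete, the sketch below shows a minimal PIT loss for multi-speaker separation, assuming NumPy arrays of shape (n_speakers, n_samples); the function name and the MSE criterion are illustrative choices, not the paper's exact setup.

```python
import itertools

import numpy as np


def pit_mse_loss(estimates: np.ndarray, targets: np.ndarray) -> float:
    """Return the MSE under the best pairing of estimated and target sources.

    estimates, targets: arrays of shape (n_speakers, n_samples).
    """
    n_spk = estimates.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(n_spk)):
        # Score the pairing that assigns estimated source i to target perm[i].
        loss = float(np.mean((estimates - targets[list(perm), :]) ** 2))
        best = min(best, loss)
    return best
```

The permutation attaining the minimum is the label assignment. Because it is re-selected at every training step, it can switch from step to step, which is exactly the instability the proposed self-supervised pre-training aims to reduce.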


Results from the Paper


Ranked #6 on Speech Separation on Libri2Mix (using extra training data)

| Task | Dataset | Model | SI-SDRi (dB) | SDRi (dB) | Uses Extra Training Data |
|---|---|---|---|---|---|
| Speech Separation | Libri2Mix | Conv-TasNet (Libri1Mix speech enhancement pre-trained) | 14.1 (#6) | 14.6 (#1) | Yes |
| Speech Separation | Libri2Mix | Conv-TasNet (Libri1Mix speech enhancement multi-task) | 13.7 (#7) | 14.1 (#2) | Yes |
| Speech Separation | Libri2Mix | Conv-TasNet | 13.2 (#8) | 13.6 (#3) | No |
| Speech Separation | WSJ0-2mix | DPTNet (Libri1Mix speech enhancement pre-trained) | 21.3 (#11) | 21.5 (#3) | Yes |

Global rank on the corresponding benchmark is given in parentheses.
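The SI-SDRi and SDRi figures above are improvements over the unprocessed mixture, measured in dB. As a reference for how the underlying scale-invariant SDR is computed, here is a minimal sketch assuming 1-D NumPy signals (the function and variable names are illustrative):

```python
import numpy as np


def si_sdr(estimate: np.ndarray, reference: np.ndarray) -> float:
    """Scale-invariant SDR in dB between a 1-D estimate and a reference."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to factor out scale.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    noise = estimate - target
    return 10.0 * np.log10(np.dot(target, target) / np.dot(noise, noise))
```

SI-SDRi is then the SI-SDR of the separated signal minus the SI-SDR of the mixture itself against the same reference.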

Methods


No methods listed for this paper.