Alternative Objective Functions for Deep Clustering

ICASSP 2018 · Wang, Z.-Q.; Le Roux, J.; Hershey, J.R.

The recently proposed deep clustering framework represents a significant step towards solving the cocktail party problem. This study proposes and compares a variety of alternative objective functions for training deep clustering networks. In addition, whereas the original deep clustering work relied on k-means clustering for test-time inference, here we investigate inference methods that are matched to the training objective. Furthermore, we explore the use of an improved chimera network architecture for speech separation, which combines deep clustering with mask-inference networks in a multiobjective training scheme. The deep clustering loss acts as a regularizer while training the end-to-end mask inference network for best separation. With further iterative phase reconstruction, our best proposed method achieves a state-of-the-art 11.5 dB signal-to-distortion ratio (SDR) result on the publicly available wsj0-2mix dataset, with a much simpler architecture than the previous best approach.
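As a rough illustration of the multiobjective idea described in the abstract, the sketch below combines the classic deep clustering objective (the squared Frobenius distance between the embedding affinity matrix and the ideal affinity matrix) with a simple mask-inference term. It is only a minimal sketch under stated assumptions: the function names, the magnitude-domain squared-error form of the mask-inference loss, the fixed source ordering (no permutation search), and the weight alpha are all hypothetical choices for illustration, not the alternative objectives or settings proposed in the paper.

```python
import numpy as np


def deep_clustering_loss(V, Y):
    """Classic deep clustering objective ||V V^T - Y Y^T||_F^2.

    V: (N, D) embeddings, one row per time-frequency bin.
    Y: (N, C) one-hot source assignments for the same bins.
    The low-rank expansion below avoids forming any N x N matrix.
    """
    VtV = V.T @ V  # (D, D)
    VtY = V.T @ Y  # (D, C)
    YtY = Y.T @ Y  # (C, C)
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtY ** 2) + np.sum(YtY ** 2)


def mask_inference_loss(masks, mixture_mag, source_mags):
    """Squared error between masked mixture and source magnitudes.

    Assumption: a plain magnitude-domain loss with a fixed source order;
    a real system would typically also search over source permutations.
    masks, source_mags: (C, N); mixture_mag: (N,).
    """
    return np.sum((masks * mixture_mag - source_mags) ** 2)


def chimera_loss(V, Y, masks, mixture_mag, source_mags, alpha=0.5):
    """Weighted sum of the two objectives (alpha is a hypothetical weight)."""
    dc = deep_clustering_loss(V, Y)
    mi = mask_inference_loss(masks, mixture_mag, source_mags)
    return alpha * dc + (1.0 - alpha) * mi
```

In this form the deep clustering term acts as a regularizer on the shared network body, while the mask-inference term drives the separation output directly; setting alpha to 0 in the sketch reduces it to pure mask inference.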

Results from the Paper


Task               Dataset    Model      Metric Name  Metric Value  Global Rank
Speech Separation  WSJ0-2mix  Chimera++  SI-SDRi      11.5          #29
