Streamlining EM into Auto-Encoder Networks
We present a new deep neural network architecture, named EDGaM, for deep clustering. This architecture can seamlessly learn deep auto-encoders and capture common group features of complex inputs in the encoded latent space. The key idea is to introduce a differentiable Gaussian mixture neural network between an encoder and a decoder. In particular, EDGaM streamlines the iterative Expectation-Maximum (EM) algorithm of the Gaussian mixture models into network design and replaces the alternative update with a forward-backward optimization. Being differentiable, both network weights and clustering centroids in EDGaM can be learned simultaneously in an end-to-end manner through standard stochastic gradient descent. To avoid preserving too many sample-specific details, we use both the clustering centroid and the original latent embedding for decoding. Meanwhile, we distill the soft clustering assignment for each sample via entropy minimization such that a clear cluster structure is exhibited. Our experiments show that our method outperforms state-of-the-art unsupervised clustering techniques in terms of both efficiency and clustering performance.
PDF Abstract