Semi-Supervised Translation with MMD Networks

28 Oct 2018 · Mark Hamilton

This work aims to improve semi-supervised learning in a neural network architecture by introducing a hybrid supervised and unsupervised cost function. The unsupervised component is trained using a differentiable estimator of the Maximum Mean Discrepancy (MMD) distance between the network output and the target dataset. We introduce the notion of an $n$-channel network and several methods to improve the performance of these networks based on supervised pre-initialization and multi-scale kernels. This work investigates the effectiveness of these methods on language translation where very few quality translations are known a priori. We also present a thorough investigation of the hyper-parameter space of this method on synthetic data.
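To make the abstract's description concrete, the sketch below shows one way such a hybrid objective could be assembled: a supervised loss on the few known translation pairs plus an unsupervised term given by a differentiable MMD estimate between network outputs and samples from the target dataset, using a multi-scale (sum of RBF bandwidths) kernel. The choice of mean-squared error for the supervised term, the specific bandwidths, and the weighting coefficient `lam` are illustrative assumptions, not the paper's exact configuration.

```python
import torch

def mmd2(x, y, bandwidths=(0.5, 1.0, 2.0, 4.0)):
    """Biased, differentiable estimate of squared MMD between samples x and y,
    using a multi-scale kernel (a sum of RBF kernels at several bandwidths)."""
    def kernel(a, b):
        # Pairwise squared Euclidean distances between the two sample sets.
        d2 = torch.cdist(a, b) ** 2
        return sum(torch.exp(-d2 / (2.0 * s ** 2)) for s in bandwidths)
    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()

def hybrid_loss(net, x_labeled, y_labeled, x_unlabeled, y_target, lam=1.0):
    """Hybrid cost: supervised error on the few labeled pairs plus an
    unsupervised MMD term matching network outputs to the target dataset."""
    supervised = torch.nn.functional.mse_loss(net(x_labeled), y_labeled)
    unsupervised = mmd2(net(x_unlabeled), y_target)
    return supervised + lam * unsupervised
```

Because both terms are differentiable, the combined loss can be minimized with ordinary gradient-based training; initializing the network from the supervised term alone would correspond to the supervised pre-initialization mentioned above.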
