At the same time, advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical.
To address these concerns, one promising approach is Private Aggregation of Teacher Ensembles (PATE), which transfers the knowledge of an ensemble of "teacher" models to a "student" model. Intuitive privacy comes from training the teachers on disjoint data, and strong privacy guarantees come from noisy aggregation of the teachers' answers.
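The noisy aggregation step can be illustrated with a minimal sketch: tally the teachers' votes for each class, perturb the counts with Laplace noise, and release only the noisy argmax. The function name `noisy_aggregate` and the `gamma` parameter (inverse noise scale) are illustrative, not part of any published API.

```python
import numpy as np

def noisy_aggregate(teacher_votes, num_classes, gamma=0.05, rng=None):
    """Return the noisy-max label over teacher votes (PATE-style aggregation).

    teacher_votes: iterable of class labels, one per teacher.
    gamma: inverse Laplace noise scale; smaller gamma means more noise
           and a stronger privacy guarantee per answered query.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    # Per-class vote counts from the disjointly trained teachers.
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    # Add independent Laplace noise to each count before taking the argmax.
    counts += rng.laplace(scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(counts))

# Five teachers vote; the student only ever sees the noisy winning label.
label = noisy_aggregate([2, 2, 1, 2, 0], num_classes=3)
```

Because the student sees only noisy winners, no single teacher's training data can dominate any answer, which is the intuition behind the privacy analysis.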
We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space.
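The discrete bottleneck can be sketched as vector quantization: each continuous latent vector from the encoder is snapped to its nearest entry in a finite codebook, so the sequence is forced through a discrete symbol space. This is a minimal illustration with random data; the codebook size, dimensions, and the `quantize` helper are assumptions for the sketch, not the paper's exact mechanism.

```python
import numpy as np

def quantize(z, codebook):
    """Snap each latent vector to its nearest codebook entry.

    z: (seq_len, dim) continuous encoder outputs.
    codebook: (num_codes, dim) learnable discrete symbols.
    Returns the quantized vectors and their discrete code indices.
    """
    # Squared Euclidean distance from every latent to every code: (seq_len, num_codes)
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d.argmin(axis=1)          # discrete code per sequence position
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # 8 discrete symbols of dimension 4
z = rng.normal(size=(10, 4))         # encoder output for a length-10 sequence
z_q, codes = quantize(z, codebook)   # decoder reconstructs from z_q only
```

The decoder then reconstructs the original sequence from `z_q` alone, so all information must pass through the short discrete code sequence `codes`.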
By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.
Finally, we describe experiments on the low-resource English-Esperanto language pair, for which only a limited amount of parallel data exists, to show the potential impact of our method in fully unsupervised machine translation.