Learning in models with discrete latent variables is challenging due to the high variance of gradient estimators.
Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well.
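The variance problem above comes from the score-function (REINFORCE) estimator, the standard way to differentiate an expectation over a discrete latent variable. A minimal sketch for a single Bernoulli latent, with a hypothetical objective `f` chosen purely for illustration:

```python
import numpy as np

def score_function_grad(f, theta, n_samples=100_000, rng=None):
    """Monte Carlo REINFORCE estimate of d/dtheta E_{z~Bern(theta)}[f(z)].

    Uses grad = E[f(z) * d/dtheta log p(z; theta)], which is unbiased
    but typically high-variance -- the difficulty the text refers to.
    """
    rng = rng or np.random.default_rng(0)
    z = (rng.random(n_samples) < theta).astype(float)
    # Score of the Bernoulli: d/dtheta log p(z; theta) = z/theta - (1-z)/(1-theta)
    score = z / theta - (1.0 - z) / (1.0 - theta)
    return float(np.mean(f(z) * score))

# Illustrative objective (an assumption, not from any of the papers above)
f = lambda z: (z - 0.4) ** 2
theta = 0.3
# Exact gradient of theta*f(1) + (1-theta)*f(0) is f(1) - f(0)
exact = f(1.0) - f(0.0)
estimate = score_function_grad(f, theta)
```

Even in this one-dimensional case the per-sample variance is large relative to the true gradient, which is why control variates and relaxations are active research areas for these models.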
Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network.
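The continuous-depth idea above can be sketched with a fixed-step Euler integrator: instead of stacking discrete layers, a small network defines dh/dt, and the "forward pass" is numerical integration. The derivative network `tanh(W @ h)` and the step count are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def dynamics(h, t, W):
    """Hypothetical derivative network: dh/dt = tanh(W @ h)."""
    return np.tanh(W @ h)

def odeint_euler(f, h0, t0, t1, W, steps=100):
    """Integrate dh/dt = f(h, t, W) from t0 to t1 with fixed-step Euler.
    Each Euler step plays the role of an (infinitesimally thin) layer."""
    h = h0.copy()
    dt = (t1 - t0) / steps
    for i in range(steps):
        h = h + dt * f(h, t0 + i * dt, W)
    return h

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)) / 2.0
h0 = rng.standard_normal(4)
h1 = odeint_euler(dynamics, h0, 0.0, 1.0, W)  # "output" hidden state
```

In practice an adaptive black-box ODE solver replaces the fixed Euler loop, which is what lets accuracy trade off against compute at evaluation time.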
This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as perform manipulations in the latent space to induce change in the output space.
This work considers variational attention networks, alternatives to soft and hard attention for learning latent variable alignment models, with tighter approximation bounds based on amortized variational inference.
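The contrast between soft and hard attention can be sketched as follows: soft attention takes the expectation of the values under the alignment distribution, while hard attention treats the alignment as a latent categorical variable and samples it (variational inference then learns a posterior over that latent). All shapes and the dot-product scoring are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
T, d = 5, 8                          # source length, hidden size (illustrative)
keys = rng.standard_normal((T, d))
values = rng.standard_normal((T, d))
query = rng.standard_normal(d)

p = softmax(keys @ query)            # alignment distribution p(z | x, query)
soft_context = p @ values            # soft attention: expectation over z
z = rng.choice(T, p=p)               # hard attention: sample latent alignment z
hard_context = values[z]             # context from a single aligned position
```

Soft attention is an expectation and hence deterministic, whereas the sampled variant keeps the alignment as an explicit latent variable, which is what the variational treatment bounds more tightly.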