Disentangled Representation Learning with Sequential Residual Variational Autoencoder

ICLR 2020 · Nanxiang Li, Shabnam Ghaffarzadegan, Liu Ren

Recent advances in unsupervised disentangled representation learning focus on extending the variational autoencoder (VAE) with an augmented objective function that balances the trade-off between disentanglement and reconstruction. We propose the Sequential Residual Variational Autoencoder (SR-VAE), which uses a "Residual learning" mechanism as the training regime instead of an augmented objective function. Our solution combines two ideas in a single framework: (1) learning from the residual between the input data and the accumulated reconstruction of sequentially added latent variables; (2) decomposing the reconstruction into the decoder output and a residual term. This formulation encourages disentanglement in the latent space by inducing an explicit dependency structure, and it relaxes the VAE bottleneck because the added residual term facilitates reconstruction. More importantly, SR-VAE eliminates the hyperparameter tuning that prior state-of-the-art methods based on objective-function augmentation depend on. We demonstrate both qualitatively and quantitatively that SR-VAE improves on the state of the art in unsupervised disentangled representation learning on a variety of complex datasets.
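
The two ideas in the abstract can be illustrated with a minimal data-flow sketch. It is one possible reading of the abstract, not the authors' implementation: the function `sr_forward`, the stand-ins `toy_encode` and `toy_decode`, and the choice of which residual is added back at the end are all assumptions made here for illustration. The stand-ins are untrained, so the output values carry no meaning; only the order of operations does.

```python
import random


def sr_forward(x, encode, decode, num_latents, rng):
    """Sketch of a sequential residual forward pass (assumed reading).

    Latents are added one at a time. Each new latent is inferred from the
    residual between the input and the reconstruction accumulated so far.
    """
    residual = list(x)            # before any latent is added, the residual is x
    z = [0.0] * num_latents       # latents not yet added stay at zero
    recon = [0.0] * len(x)
    for i in range(num_latents):
        mu, sigma = encode(residual, i)           # posterior of latent i
        z[i] = mu + sigma * rng.gauss(0.0, 1.0)   # reparameterised sample
        recon = decode(z)                         # accumulated reconstruction
        if i < num_latents - 1:
            # Idea (1): the next latent only sees what is still unexplained.
            residual = [xj - rj for xj, rj in zip(x, recon)]
    # Idea (2): output = decoder output + residual term (assumption: the
    # residual that was fed to the last step).
    output = [rj + gj for rj, gj in zip(recon, residual)]
    return output, z, residual


def toy_encode(residual, i):
    """Hypothetical encoder: mean is the i-th coordinate, zero std."""
    return residual[i], 0.0


def toy_decode(z, dim=3):
    """Hypothetical decoder: copies latents into the first coordinates."""
    return [z[j] if j < len(z) else 0.0 for j in range(dim)]


output, z, residual = sr_forward(
    [1.0, 2.0, 3.0], toy_encode, toy_decode, num_latents=2, rng=random.Random(0)
)
```

In a real model `encode` and `decode` would be neural networks trained jointly, and the loop would sit inside a standard VAE training step.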
