Stacked What-Where Auto-encoders

8 Jun 2015  ·  Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun

We present a novel architecture, the "stacked what-where auto-encoders" (SWWAE), which integrates discriminative and generative pathways and provides a unified approach to supervised, semi-supervised and unsupervised learning without relying on sampling during training. An instantiation of SWWAE uses a convolutional net (Convnet) (LeCun et al. (1998)) to encode the input, and employs a deconvolutional net (Deconvnet) (Zeiler et al. (2010)) to produce the reconstruction. The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet. Each pooling layer produces two sets of variables: the "what" variables, which are fed to the next layer, and the complementary "where" variables, which are fed to the corresponding layer in the generative decoder.
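The "what"/"where" split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes non-overlapping 2x2 max pooling on a single 2D feature map, takes "what" to be the window maximum and "where" to be its position, and the function names (pool_what_where, unpool, l2_loss) are hypothetical. The squared-error function stands in for the kind of reconstruction term the abstract mentions.

```python
def pool_what_where(x, k=2):
    """Non-overlapping k x k max pooling over a 2D list (illustrative only).

    Returns (what, where): the maximum of each window, and the (row, col)
    position in x at which that maximum was found.
    """
    rows, cols = len(x), len(x[0])
    what, where = [], []
    for i in range(0, rows - k + 1, k):
        what_row, where_row = [], []
        for j in range(0, cols - k + 1, k):
            # Every position inside the current pooling window.
            window = [(i + di, j + dj) for di in range(k) for dj in range(k)]
            r, c = max(window, key=lambda p: x[p[0]][p[1]])
            what_row.append(x[r][c])
            where_row.append((r, c))
        what.append(what_row)
        where.append(where_row)
    return what, where


def unpool(what, where, shape):
    """Decoder-side unpooling: put each "what" value back at its "where"
    position and fill every other position with zero."""
    rows, cols = shape
    out = [[0.0] * cols for _ in range(rows)]
    for what_row, where_row in zip(what, where):
        for value, (r, c) in zip(what_row, where_row):
            out[r][c] = value
    return out


def l2_loss(a, b):
    """Sum of squared differences between two equally shaped 2D lists."""
    return sum((u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb))


x = [[1.0, 2.0, 0.0, 1.0],
     [4.0, 3.0, 5.0, 0.0],
     [0.0, 1.0, 2.0, 2.5],
     [7.0, 0.5, 1.0, 0.0]]

what, where = pool_what_where(x)      # encoder side
recon = unpool(what, where, (4, 4))   # decoder side, guided by "where"
print(what)    # [[4.0, 5.0], [7.0, 2.5]]
print(where)   # [[(1, 0), (1, 2)], [(3, 0), (2, 3)]]
print(l2_loss(x, recon))  # 21.25
```

In this toy setting the "where" variables carry exactly the positional information that max pooling discards, which is why the decoder can restore each maximum to its original location rather than to a fixed corner of the window.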

Task                                   Dataset               Model                              Metric Name         Metric Value   Global Rank
Image Classification                   CIFAR-10              SWWAE                              Percentage correct  92.2           # 165
Image Classification                   CIFAR-100             SWWAE                              Percentage correct  69.1           # 167
Image Classification                   MNIST                 Zhao et al. (2015) (auto-encoder)  Percentage error    4.76           # 80
Image Classification                   STL-10                SWWAE                              Percentage correct  74.3           # 79
Semi-Supervised Image Classification   STL-10, 1000 Labels   SWWAE                              Accuracy            74.30          # 10
