Variational methods that rely on a recognition network to approximate the
posterior of directed graphical models offer better inference and learning than
previous methods. Recent advances that exploit the capacity and flexibility in
this approach have expanded what kinds of models can be trained...
However, as a
proposal for the posterior, the capacity of the recognition network is limited,
which can constrain the representational power of the generative model and
increase the variance of Monte Carlo estimates. To address these issues, we
introduce an iterative refinement procedure for improving the approximate
posterior of the recognition network and show that training with the refined
posterior is competitive with state-of-the-art methods. The advantages of
refinement are further evident in an increased effective sample size, which
implies a lower variance of gradient estimates.