Disentangled Representations using Trained Models
We propose a novel method for learning disentangled representations. Disentangled representations are useful for many tasks because they capture information about the samples of a dataset in an interpretable and compact structure; developing methods that learn them is therefore an active area of research. In contrast to previously proposed methods, we require neither access to the values of the interpretable factors nor information about groups of data samples that share the values of some interpretable factors. Our algorithm uses only a set of models that have already been trained on the data. Using the implicit function theorem, we show how a diverse set of such trained models can be used to select pairs of data points that share a common value of the interpretable factors. We prove that such an auxiliary sampler is sufficient to obtain a disentangled representation, and based on this theoretical result we propose a loss function that the method optimizes to compute that representation. Our approach is easy to implement and shows promising results in simulations.
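To make the role of the auxiliary sampler concrete, the following is a minimal sketch of the kind of pairwise loss such a sampler enables. Everything here is illustrative, not the paper's actual formulation: the linear encoder `W`, the choice of shared coordinates `shared_idx`, and the near-duplicate pair construction are all hypothetical stand-ins for the sampler and representation learned by the proposed method.

```python
import numpy as np

def pair_agreement_loss(z1, z2, shared_idx):
    """Penalize disagreement on the representation coordinates that are
    supposed to encode the interpretable factors the pair shares
    (hypothetical loss form, not the paper's exact objective)."""
    return float(np.mean((z1[shared_idx] - z2[shared_idx]) ** 2))

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 4))          # hypothetical linear encoder
x1 = rng.standard_normal(4)
x2 = x1 + 0.01 * rng.standard_normal(4)  # stand-in for a sampled pair sharing factor values
z1, z2 = W @ x1, W @ x2
loss = pair_agreement_loss(z1, z2, np.arange(2))
print(loss)
```

Minimizing such a loss over many sampled pairs pushes the designated coordinates of the representation to depend only on the shared interpretable factors, which is the intuition behind why the auxiliary sampler suffices.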