How to Not Measure Disentanglement

Several metrics have been proposed to evaluate disentangled representations. However, conventional metrics of disentanglement lack theoretical guarantees, and they do not correlate consistently with the outcomes of qualitative studies. In this paper, we analyze metrics of disentanglement and their properties. We conclude that existing metrics were created to reflect different characteristics of disentanglement and do not satisfy two basic desirable properties: (1) assigning a high score to representations that are disentangled according to the definition; and (2) assigning a low score to representations that are entangled according to the definition. In addition, we propose a new metric of disentanglement and prove that it satisfies both properties.

