ICML Workshop URL 2021 • Mirco Mutti, Stefano Del Col, Marcello Restelli
In this paper, we seek a reward-free compression of the policy space into a finite set of representative policies such that, for any policy $\pi$, the minimum Rényi divergence between the state-action distributions of the representative policies and the state-action distribution of $\pi$ is bounded.
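To make the criterion concrete, the following is a minimal sketch, not the authors' implementation, of how the minimum Rényi divergence to a set of representative policies could be evaluated in a small tabular setting. It assumes each state-action distribution is given as a flat probability vector over state-action pairs. The function names `renyi_divergence` and `min_divergence_to_set`, the order `alpha = 2`, and the argument order (target distribution first, representative second) are all illustrative assumptions; the abstract does not specify them.

```python
import numpy as np


def renyi_divergence(p, q, alpha=2.0):
    """Renyi divergence D_alpha(p || q) for discrete distributions, alpha > 1.

    p and q are probability vectors over the same finite set of
    state-action pairs (an assumption made for this sketch).
    """
    if alpha <= 1:
        raise ValueError("this sketch only handles alpha > 1")
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    support = p > 0
    # If q assigns zero mass where p does not, the divergence is infinite.
    if np.any(q[support] == 0):
        return np.inf
    total = np.sum(p[support] ** alpha * q[support] ** (1.0 - alpha))
    return np.log(total) / (alpha - 1.0)


def min_divergence_to_set(target, representatives, alpha=2.0):
    """Smallest divergence from `target` to any representative distribution.

    The argument order D_alpha(target || representative) is an assumption.
    """
    return min(renyi_divergence(target, d, alpha) for d in representatives)


# Hypothetical distributions over four state-action pairs.
target = np.array([0.4, 0.3, 0.2, 0.1])
representatives = [
    np.array([0.25, 0.25, 0.25, 0.25]),
    np.array([0.7, 0.1, 0.1, 0.1]),
]
closest = min_divergence_to_set(target, representatives)
```

Under these assumptions, a compression would be acceptable when `closest` stays below a chosen threshold for every policy in the original policy space, which is the bounded-divergence property described above.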