QUOTA: The Quantile Option Architecture for Reinforcement Learning

5 Nov 2018 · Shangtong Zhang, Borislav Mavrin, Linglong Kong, Bo Liu, Hengshuai Yao

In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration, based on recent advances in distributional reinforcement learning (RL). In QUOTA, decision making is based on quantiles of a value distribution, not only on its mean. QUOTA provides a new dimension for exploration by making use of both the optimism and the pessimism of a value distribution. We demonstrate the performance advantage of QUOTA in both challenging video games and physical robot simulators.
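
To illustrate the difference between acting on the mean and acting on a quantile of a value distribution, here is a minimal sketch. It is not the authors' implementation: the function names `greedy_by_mean` and `greedy_by_quantile`, the table layout (one row of quantile estimates per action, sorted from low to high), and the numbers are all assumptions made for illustration.

```python
from typing import Sequence


def greedy_by_mean(quantiles: Sequence[Sequence[float]]) -> int:
    """Pick the action whose quantile estimates have the highest average."""
    means = [sum(q) / len(q) for q in quantiles]
    return max(range(len(means)), key=means.__getitem__)


def greedy_by_quantile(quantiles: Sequence[Sequence[float]], index: int) -> int:
    """Pick the action with the highest estimate at one quantile index.

    A low index gives a pessimistic choice, a high index an optimistic one.
    """
    return max(range(len(quantiles)), key=lambda a: quantiles[a][index])


# Hypothetical quantile estimates for two actions (illustrative values only).
quantiles = [
    [0.9, 1.0, 1.1],   # action 0: narrow distribution, mean 1.0
    [-2.0, 0.5, 3.0],  # action 1: wide distribution, mean 0.5
]

mean_action = greedy_by_mean(quantiles)
pessimistic_action = greedy_by_quantile(quantiles, 0)
optimistic_action = greedy_by_quantile(quantiles, 2)
print(mean_action, pessimistic_action, optimistic_action)
```

In this toy example the mean and the low quantile both favor action 0, while the high quantile favors action 1, so an agent that can switch between quantiles has access to behavior that a mean-greedy agent would never select.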
