Regulatory Focus: Promotion and Prevention Inclinations in Policy Search

25 Sep 2019  ·  Lanxin Lei, Zhizhong Li, Xiaoyang Li, Cong Qiu, Dahua Lin ·

The estimation of advantage is crucial for a number of reinforcement learning algorithms, as it directly influences the choices of future paths. In this work, we propose a family of estimates based on the order statistics over the path ensemble, which allows one to flexibly drive the learning process in a promotion focus or prevention focus. On top of this formulation, we systematically study the impacts of different regulatory focuses. Our findings reveal that regulatory focus, when chosen appropriately, can result in significant benefits. In particular, for the environments with sparse rewards, promotion focus would lead to more efficient exploration of the policy space; while for those where individual actions can have critical impacts, prevention focus is preferable. On various benchmarks, including MuJoCo continuous control, Terrain locomotion, Atari games, and sparse-reward environments, the proposed schemes consistently demonstrate improvement over mainstream methods, not only accelerating the learning process but also obtaining substantial performance gains.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here