ICLR 2022 • Zhizhou Ren, Ruihan Guo, Yuan Zhou, Jian Peng
Based on this framework, this paper proposes a novel reward redistribution algorithm, randomized return decomposition (RRD), to learn a proxy reward function for episodic reinforcement learning.
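
The excerpt does not spell out the training objective, so the following is only a minimal sketch of one way a randomized return decomposition loss could look. It assumes the proxy reward model is fit by regressing the episodic return onto a rescaled sum of proxy rewards over a randomly sampled subset of time steps. The function name `rrd_loss`, its parameters, and the toy inputs are hypothetical and are not taken from the authors' code.

```python
import random


def rrd_loss(proxy_rewards, episodic_return, k, rng):
    """Squared error between the episodic return and a subsampled estimate.

    proxy_rewards: per-step outputs of a (hypothetical) proxy reward model
    episodic_return: the single return observed at the end of the episode
    k: number of time steps sampled without replacement (assumed 1 <= k <= T)
    rng: a random.Random instance, passed in for reproducibility
    """
    t = len(proxy_rewards)
    # Assumption: time steps are drawn uniformly without replacement.
    idx = rng.sample(range(t), k)
    # Rescale the partial sum by T / k so it estimates the full-episode sum.
    estimate = (t / k) * sum(proxy_rewards[i] for i in idx)
    return (episodic_return - estimate) ** 2


rng = random.Random(0)
# Toy episode: four steps with a constant proxy reward and a return of 2.0.
loss_value = rrd_loss([0.5, 0.5, 0.5, 0.5], 2.0, 2, rng)
```

Under these assumptions, setting `k` equal to the episode length recovers a plain least-squares return decomposition over the whole trajectory, while smaller `k` trades estimator variance for cheaper updates on long episodes.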