no code implementations • 16 Jul 2023 • Hongyu Ding, Yuanze Tang, Qing Wu, Bo Wang, Chunlin Chen, Zhi Wang
Existing reward shaping methods for goal-conditioned RL are typically built on distance metrics with a linear and isotropic distribution, which may fail to provide sufficient information about complex, ever-changing environments.