1 code implementation • 17 Sep 2021 • Guillermo Grande, Dhruv Batra, Erik Wijmans
Under this setting, the agent incurs a penalty for using this privileged information, encouraging the agent to only leverage this information when it is crucial to learning.