1 code implementation • 4 Jun 2021 • Alejandro Daniel Noel, Charel van Hoof, Beren Millidge
Our model solves sparse-reward problems with very high sample efficiency because its objective function encourages directed exploration of uncertain states.