Playing hard exploration games by watching YouTube

NeurIPS 2018 Yusuf Aytar • Tobias Pfaff • David Budden • Tom Le Paine • Ziyu Wang • Nando de Freitas

Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse. One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. [...] and Private Eye for the first time, even if the agent is not presented with any environment rewards.

