Playing hard exploration games by watching YouTube

NeurIPS 2018 Yusuf Aytar • Tobias Pfaff • David Budden • Tom Le Paine • Ziyu Wang • Nando de Freitas

Deep reinforcement learning methods traditionally struggle with tasks where environment rewards are particularly sparse. One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. and Private Eye for the first time, even if the agent is not presented with any environment rewards.

Full paper


No evaluation results yet. Help compare this paper to other papers by submitting the tasks and evaluation metrics from the paper.