2 code implementations • 29 Sep 2017 • Gregory Kahn, Adam Villaflor, Bosen Ding, Pieter Abbeel, Sergey Levine
To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between the model-free and model-based extremes.
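
As a rough illustration of what interpolating between the two families can mean, the sketch below computes an H-step return target. With a one-step horizon and a bootstrapped value it reduces to the familiar one-step value-based target; with a longer horizon and no bootstrapping it relies only on a sequence of rewards, which is closer in spirit to a model-based rollout. This is an assumption made for illustration, not a construction taken from the text above, and the function name `h_step_target` and all of its parameters are hypothetical.

```python
from typing import Sequence


def h_step_target(
    rewards: Sequence[float],
    bootstrap_value: float,
    gamma: float,
    use_bootstrap: bool = True,
) -> float:
    """Discounted sum of H rewards plus an optional discounted bootstrap term.

    Hypothetical helper: the horizon H is len(rewards). All names here are
    illustrative assumptions, not part of any published implementation.
    """
    target = 0.0
    # Accumulate the discounted rewards over the horizon.
    for step, reward in enumerate(rewards):
        target += (gamma ** step) * reward
    # Optionally add a value estimate for the state reached after H steps.
    if use_bootstrap:
        target += (gamma ** len(rewards)) * bootstrap_value
    return target


# Horizon 1 with bootstrapping: the usual one-step value-based target.
one_step = h_step_target([1.0], bootstrap_value=2.0, gamma=0.5)

# Horizon 3 without bootstrapping: a pure multi-step reward rollout.
rollout = h_step_target([1.0, 1.0, 1.0], bootstrap_value=10.0, gamma=0.5,
                        use_bootstrap=False)
```

In this sketch the two knobs, the horizon length and whether a bootstrap term is added, are what move the target between the two ends; intermediate settings (for example a horizon of 3 with bootstrapping) sit in between.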