no code implementations • 12 Mar 2019 • Daan Wout, Jan Scholten, Carlos Celemin, Jens Kober
We demonstrate that the novel algorithm outperforms the current state of the art in final performance, convergence rate, and robustness to erroneous feedback on OpenAI Gym continuous control benchmarks, for both simulated and real human teachers.
2 code implementations • 14 Mar 2019 • Jan Scholten, Daan Wout, Carlos Celemin, Jens Kober
We employ binary corrective feedback as a general and intuitive way to incorporate human intuition and domain knowledge into model-free machine learning.