Curiosity-based reward schemes can present powerful exploration mechanisms which facilitate the discovery of solutions for complex, sparse or long-horizon tasks.
no code implementations • 3 Nov 2020 • Markus Wulfmeier, Arunkumar Byravan, Tim Hertweck, Irina Higgins, Ankush Gupta, tejas kulkarni, Malcolm Reynolds, Denis Teplyashin, Roland Hafner, Thomas Lampe, Martin Riedmiller
Furthermore, the value of each representation is evaluated in terms of three properties: dimensionality, observability and disentanglement.
Modern Reinforcement Learning (RL) algorithms promise to solve difficult motor control problems directly from raw sensory inputs.
no code implementations • 30 Jul 2020 • Markus Wulfmeier, Dushyant Rao, Roland Hafner, Thomas Lampe, Abbas Abdolmaleki, Tim Hertweck, Michael Neunert, Dhruva Tirumala, Noah Siegel, Nicolas Heess, Martin Riedmiller
We introduce Hindsight Off-policy Options (HO2), a data-efficient option learning algorithm.
In particular, we show that a real robotic arm can learn to grasp and lift and solve a Ball-in-a-Cup task from scratch, when only raw sensor streams are used for both controller input and in the auxiliary reward definition.
This is in contrast to the state-of-the-art reinforcement learning agents, which typically start learning each new task from scratch and struggle with knowledge transfer.
no code implementations • 26 Jun 2019 • Markus Wulfmeier, Abbas Abdolmaleki, Roland Hafner, Jost Tobias Springenberg, Michael Neunert, Tim Hertweck, Thomas Lampe, Noah Siegel, Nicolas Heess, Martin Riedmiller
The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements.
no code implementations • 13 Feb 2019 • Devin Schwab, Tobias Springenberg, Murilo F. Martins, Thomas Lampe, Michael Neunert, Abbas Abdolmaleki, Tim Hertweck, Roland Hafner, Francesco Nori, Martin Riedmiller
We present a method for fast training of vision based control policies on real robots.