1 code implementation • 1 Aug 2024 • Moritz Meser, Aditya Bhatt, Boris Belousov, Jan Peters
We tackle HumanoidBench, a recently introduced benchmark for whole-body humanoid control, using MuJoCo MPC.
no code implementations • 25 May 2024 • Théo Vincent, Fabian Wahren, Jan Peters, Boris Belousov, Carlo D'Eramo
Deep Reinforcement Learning (RL) is well known for being highly sensitive to hyperparameters, requiring substantial effort from practitioners to optimize them for the problem at hand.
no code implementations • 20 Mar 2024 • Alina Böhm, Tim Schneider, Boris Belousov, Alap Kshirsagar, Lisa Lin, Katja Doerschner, Knut Drewing, Constantin A. Rothkopf, Jan Peters
By evaluating our method on a previously published Active Clothing Perception Dataset and on a real robotic system, we establish that the choice of the active exploration strategy has only a minor influence on the recognition accuracy, whereas data augmentation and dropout rate play a significantly larger role.
no code implementations • 4 Mar 2024 • Théo Vincent, Daniel Palenicek, Boris Belousov, Jan Peters, Carlo D'Eramo
It has been observed that this scheme can be potentially generalized to carry out multiple iterations of the Bellman operator at once, benefiting the underlying learning algorithm.
1 code implementation • 20 Dec 2023 • Théo Vincent, Alberto Maria Metelli, Boris Belousov, Jan Peters, Marcello Restelli, Carlo D'Eramo
We formulate an optimization problem to learn PBO for generic sequential decision-making problems, and we theoretically analyze its properties in two representative classes of RL problems.
1 code implementation • 26 Nov 2022 • Max Siebenborn, Boris Belousov, Junning Huang, Jan Peters
On the other hand, the proposed Decision LSTM is able to achieve expert-level performance on these tasks, in addition to learning a swing-up controller on the real system.
no code implementations • 23 Oct 2022 • Tim Schneider, Boris Belousov, Georgia Chalvatzaki, Diego Romeres, Devesh K. Jha, Jan Peters
Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in recent years.
no code implementations • 1 Jun 2022 • Tim Schneider, Boris Belousov, Hany Abdulsamad, Jan Peters
Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in the last decades.
1 code implementation • 5 Oct 2021 • Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters
Especially for continuous control, solving this differential equation and its extension, the Hamilton-Jacobi-Isaacs equation, is important as it yields the optimal policy that achieves the maximum reward on a given task.
no code implementations • 29 Mar 2021 • Hany Abdulsamad, Tim Dorau, Boris Belousov, Jia-Jie Zhu, Jan Peters
Trajectory optimization and model predictive control are essential techniques underpinning advanced robotic applications, ranging from autonomous driving to full-body humanoid control.
1 code implementation • 25 Feb 2021 • Pascal Klink, Hany Abdulsamad, Boris Belousov, Carlo D'Eramo, Jan Peters, Joni Pajarinen
Across machine learning, the use of curricula has shown strong empirical potential to improve learning from data by avoiding local optima of training objectives.
1 code implementation • 8 Oct 2019 • Matthias Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters
Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion.
1 code implementation • 7 Oct 2019 • Pascal Klink, Hany Abdulsamad, Boris Belousov, Jan Peters
Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots.
no code implementations • 13 Sep 2019 • Michael Lutter, Boris Belousov, Kim Listmann, Debora Clever, Jan Peters
The corresponding optimal value function is learned end-to-end by embedding a deep differential network in the Hamilton-Jacobi-Bellman differential equation and minimizing the error of this equality while simultaneously decreasing the discounting from short- to far-sighted to enable the learning.
no code implementations • 6 Jul 2019 • Boris Belousov, Jan Peters
An optimal feedback controller for a given Markov decision process (MDP) can in principle be synthesized by value or policy iteration.
no code implementations • 21 Jun 2019 • David Nass, Boris Belousov, Jan Peters
With the increasing pace of automation, modern robotic systems need to act in stochastic, non-stationary, partially observable environments.
3 code implementations • 14 Feb 2019 • Aditya Bhatt, Daniel Palenicek, Boris Belousov, Max Argus, Artemij Amiranashvili, Thomas Brox, Jan Peters
Sample efficiency is a crucial problem in deep reinforcement learning.
1 code implementation • 29 Dec 2017 • Boris Belousov, Jan Peters
We carry out asymptotic analysis of the solutions for different values of $\alpha$ and demonstrate the effects of using different divergence functions on a multi-armed bandit problem and on standard reinforcement learning problems.
no code implementations • NeurIPS 2016 • Boris Belousov, Gerhard Neumann, Constantin A. Rothkopf, Jan R. Peters
In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty.