Search Results for author: Boris Belousov

Found 19 papers, 9 papers with code

MuJoCo MPC for Humanoid Control: Evaluation on HumanoidBench

1 code implementation · 1 Aug 2024 · Moritz Meser, Aditya Bhatt, Boris Belousov, Jan Peters

We tackle the recently introduced benchmark for whole-body humanoid control HumanoidBench using MuJoCo MPC.

Humanoid Control

Adaptive $Q$-Network: On-the-fly Target Selection for Deep Reinforcement Learning

no code implementations · 25 May 2024 · Théo Vincent, Fabian Wahren, Jan Peters, Boris Belousov, Carlo D'Eramo

Deep Reinforcement Learning (RL) is well known for being highly sensitive to hyperparameters, requiring substantial effort from practitioners to optimize them for the problem at hand.

AutoML, reinforcement-learning, +1

What Matters for Active Texture Recognition With Vision-Based Tactile Sensors

no code implementations · 20 Mar 2024 · Alina Böhm, Tim Schneider, Boris Belousov, Alap Kshirsagar, Lisa Lin, Katja Doerschner, Knut Drewing, Constantin A. Rothkopf, Jan Peters

By evaluating our method on a previously published Active Clothing Perception Dataset and on a real robotic system, we establish that the choice of the active exploration strategy has only a minor influence on the recognition accuracy, whereas data augmentation and dropout rate play a significantly larger role.

Data Augmentation

Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning

no code implementations · 4 Mar 2024 · Théo Vincent, Daniel Palenicek, Boris Belousov, Jan Peters, Carlo D'Eramo

It has been observed that this scheme can be potentially generalized to carry out multiple iterations of the Bellman operator at once, benefiting the underlying learning algorithm.

Atari Games, Continuous Control, +1

Parameterized Projected Bellman Operator

1 code implementation · 20 Dec 2023 · Théo Vincent, Alberto Maria Metelli, Boris Belousov, Jan Peters, Marcello Restelli, Carlo D'Eramo

We formulate an optimization problem to learn PBO for generic sequential decision-making problems, and we theoretically analyze its properties in two representative classes of RL problems.

Decision Making, Reinforcement Learning (RL)

How Crucial is Transformer in Decision Transformer?

1 code implementation · 26 Nov 2022 · Max Siebenborn, Boris Belousov, Junning Huang, Jan Peters

On the other hand, the proposed Decision LSTM is able to achieve expert-level performance on these tasks, in addition to learning a swing-up controller on the real system.

Continuous Control, Decision Making

Active Exploration for Robotic Manipulation

no code implementations · 23 Oct 2022 · Tim Schneider, Boris Belousov, Georgia Chalvatzaki, Diego Romeres, Devesh K. Jha, Jan Peters

Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in recent years.

Model-based Reinforcement Learning, Model Predictive Control

Active Inference for Robotic Manipulation

no code implementations · 1 Jun 2022 · Tim Schneider, Boris Belousov, Hany Abdulsamad, Jan Peters

Robotic manipulation stands as a largely unsolved problem despite significant advances in robotics and machine learning in the last decades.

Continuous-Time Fitted Value Iteration for Robust Policies

1 code implementation · 5 Oct 2021 · Michael Lutter, Boris Belousov, Shie Mannor, Dieter Fox, Animesh Garg, Jan Peters

Especially for continuous control, solving this differential equation and its extension, the Hamilton-Jacobi-Isaacs equation, is important as it yields the optimal policy that achieves the maximum reward on a given task.

Continuous Control

Distributionally Robust Trajectory Optimization Under Uncertain Dynamics via Relative Entropy Trust-Regions

no code implementations · 29 Mar 2021 · Hany Abdulsamad, Tim Dorau, Boris Belousov, Jia-Jie Zhu, Jan Peters

Trajectory optimization and model predictive control are essential techniques underpinning advanced robotic applications, ranging from autonomous driving to full-body humanoid control.

Autonomous Driving, Humanoid Control, +1

A Probabilistic Interpretation of Self-Paced Learning with Applications to Reinforcement Learning

1 code implementation · 25 Feb 2021 · Pascal Klink, Hany Abdulsamad, Boris Belousov, Carlo D'Eramo, Jan Peters, Joni Pajarinen

Across machine learning, the use of curricula has shown strong empirical potential to improve learning from data by avoiding local optima of training objectives.

reinforcement-learning, Reinforcement Learning (RL)

Receding Horizon Curiosity

1 code implementation · 8 Oct 2019 · Matthias Schultheis, Boris Belousov, Hany Abdulsamad, Jan Peters

Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion.

Efficient Exploration, Experimental Design, +1

Self-Paced Contextual Reinforcement Learning

1 code implementation · 7 Oct 2019 · Pascal Klink, Hany Abdulsamad, Boris Belousov, Jan Peters

Generalization and adaptation of learned skills to novel situations is a core requirement for intelligent autonomous robots.

reinforcement-learning, Reinforcement Learning (RL)

HJB Optimal Feedback Control with Deep Differential Value Functions and Action Constraints

no code implementations · 13 Sep 2019 · Michael Lutter, Boris Belousov, Kim Listmann, Debora Clever, Jan Peters

The corresponding optimal value function is learned end-to-end by embedding a deep differential network in the Hamilton-Jacobi-Bellman differential equation and minimizing the error of this equality while simultaneously decreasing the discounting from short- to far-sighted to enable the learning.

reinforcement-learning, Reinforcement Learning (RL)

Entropic Regularization of Markov Decision Processes

no code implementations · 6 Jul 2019 · Boris Belousov, Jan Peters

An optimal feedback controller for a given Markov decision process (MDP) can in principle be synthesized by value or policy iteration.

Entropic Risk Measure in Policy Search

no code implementations · 21 Jun 2019 · David Nass, Boris Belousov, Jan Peters

With the increasing pace of automation, modern robotic systems need to act in stochastic, non-stationary, partially observable environments.

Policy Gradient Methods

f-Divergence constrained policy improvement

1 code implementation · 29 Dec 2017 · Boris Belousov, Jan Peters

We carry out asymptotic analysis of the solutions for different values of $\alpha$ and demonstrate the effects of using different divergence functions on a multi-armed bandit problem and on standard reinforcement learning problems.

Catching heuristics are optimal control policies

no code implementations · NeurIPS 2016 · Boris Belousov, Gerhard Neumann, Constantin A. Rothkopf, Jan R. Peters

In this paper, we show that interception strategies appearing to be heuristics can be understood as computational solutions to the optimal control problem faced by a ball-catching agent acting under uncertainty.
