Search Results for author: Steffen Udluft

Found 21 papers, 8 papers with code

Model-based Offline Quantum Reinforcement Learning

no code implementations14 Apr 2024 Simon Eisenmann, Daniel Hein, Steffen Udluft, Thomas A. Runkler

The policy is optimized with a gradient-free optimization scheme using the return estimate given by the model as the fitness function.

reinforcement-learning
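The gradient-free, model-based optimization described above can be sketched as a simple (1+1) evolution strategy in which the learned model's return estimate serves as the fitness function. The linear tanh policy, the `model_step` interface, and all hyperparameters below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def model_return(policy_params, model_step, start_states, horizon=20):
    """Estimate a policy's return by rolling out the learned model (fitness)."""
    total = 0.0
    for s in start_states:
        state = np.array(s, dtype=float)
        for _ in range(horizon):
            action = np.tanh(policy_params @ state)    # linear policy, illustrative
            state, reward = model_step(state, action)  # learned dynamics model
            total += reward
    return total / len(start_states)

def gradient_free_search(model_step, start_states, dim_s, dim_a,
                         iters=200, sigma=0.1, seed=0):
    """Simple (1+1) evolution strategy: keep one candidate and accept
    perturbations that improve the model-based return estimate."""
    rng = np.random.default_rng(seed)
    best = rng.normal(size=(dim_a, dim_s))
    best_fit = model_return(best, model_step, start_states)
    for _ in range(iters):
        cand = best + sigma * rng.normal(size=best.shape)
        fit = model_return(cand, model_step, start_states)
        if fit > best_fit:
            best, best_fit = cand, fit
    return best, best_fit
```

Because only return evaluations are needed, the scheme works with any model that can be rolled out, including the quantum models considered in the paper.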

Learning Control Policies for Variable Objectives from Offline Data

no code implementations11 Aug 2023 Marc Weber, Phillip Swazinna, Daniel Hein, Steffen Udluft, Volkmar Sterzing

Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available.

reinforcement-learning

Automatic Trade-off Adaptation in Offline RL

no code implementations16 Jun 2023 Phillip Swazinna, Steffen Udluft, Thomas Runkler

Recently, offline RL algorithms have been proposed that remain adaptive at runtime.

Offline RL

Safe Policy Improvement Approaches and their Limitations

1 code implementation1 Aug 2022 Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft

Based on this finding, we develop adaptations, the Adv-Soft-SPIBB algorithms, and show that they are provably safe.

User-Interactive Offline Reinforcement Learning

1 code implementation21 May 2022 Phillip Swazinna, Steffen Udluft, Thomas Runkler

At the same time, offline RL algorithms are not able to tune their most important hyperparameter: the proximity of the learned policy to the original policy.

Offline RL reinforcement-learning +1
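The proximity trade-off described above can be illustrated with a toy runtime selection rule: the user picks a weight `lam` and the best candidate policy is chosen by penalizing estimated return with divergence from the original (data-collecting) policy. The candidate values and the `pick_policy` helper are hypothetical, not taken from the paper:

```python
def tradeoff_objective(est_return, divergence, lam):
    """User-tunable offline RL trade-off: estimated return penalized by
    divergence from the behavior (data-collecting) policy."""
    return est_return - lam * divergence

# Hypothetical candidates: (estimated return, divergence from the data policy)
candidates = [(10.0, 5.0), (7.0, 1.0), (5.0, 0.1)]

def pick_policy(lam):
    """Select, at runtime, the candidate maximizing the penalized objective."""
    return max(range(len(candidates)),
               key=lambda i: tradeoff_objective(*candidates[i], lam))
```

Small `lam` favors raw estimated return; large `lam` favors staying close to the data, letting the user adjust the trade-off after training.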

Safe Policy Improvement Approaches on Discrete Markov Decision Processes

1 code implementation28 Jan 2022 Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft

Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy.
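A minimal tabular sketch of the SPIBB idea that several of these approaches build on: keep the baseline's probabilities on rarely observed state-action pairs and greedily reassign the remaining probability mass among well-observed actions. The array shapes and the greedy reassignment below are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def spibb_policy(pi_baseline, q_estimate, counts, n_min):
    """Sketch of SPIBB on a tabular MDP (all arrays shaped
    [n_states, n_actions]): state-action pairs seen fewer than n_min times
    keep the baseline probability; the free mass goes to the best
    well-observed action according to the Q-estimate."""
    n_states, _ = pi_baseline.shape
    pi = np.zeros_like(pi_baseline)
    for s in range(n_states):
        rare = counts[s] < n_min
        pi[s, rare] = pi_baseline[s, rare]     # keep baseline on rare pairs
        free_mass = 1.0 - pi[s, rare].sum()
        frequent = np.flatnonzero(~rare)
        if frequent.size:                      # best well-observed action
            best = frequent[np.argmax(q_estimate[s, frequent])]
            pi[s, best] += free_mass
        else:                                  # all pairs rare: pure baseline
            pi[s] = pi_baseline[s]
    return pi
```

Restricting deviations to well-observed pairs is what makes the safety guarantee provable: where the data is thin, the policy cannot move away from the baseline.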

Comparing Model-free and Model-based Algorithms for Offline Reinforcement Learning

1 code implementation14 Jan 2022 Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler

Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists.

Offline RL reinforcement-learning +1

Measuring Data Quality for Dataset Selection in Offline Reinforcement Learning

no code implementations26 Nov 2021 Phillip Swazinna, Steffen Udluft, Thomas Runkler

Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance an algorithm can deliver depends greatly on the dataset presented to it, practitioners need to pick the right dataset among the available ones.

reinforcement-learning Reinforcement Learning (RL)

Overcoming Model Bias for Robust Offline Deep Reinforcement Learning

no code implementations12 Aug 2020 Phillip Swazinna, Steffen Udluft, Thomas Runkler

State-of-the-art reinforcement learning algorithms mostly rely on being allowed to directly interact with their environment to collect millions of observations.

Continuous Control Offline RL +2

Generating Interpretable Fuzzy Controllers using Particle Swarm Optimization and Genetic Programming

no code implementations29 Apr 2018 Daniel Hein, Steffen Udluft, Thomas A. Runkler

Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications.

reinforcement-learning Reinforcement Learning (RL)

Interpretable Policies for Reinforcement Learning by Genetic Programming

no code implementations12 Dec 2017 Daniel Hein, Steffen Udluft, Thomas A. Runkler

Here we introduce the genetic programming for reinforcement learning (GPRL) approach based on model-based batch reinforcement learning and genetic programming, which autonomously learns policy equations from pre-existing default state-action trajectory samples.

regression reinforcement-learning +2

Sensitivity Analysis for Predictive Uncertainty in Bayesian Neural Networks

no code implementations10 Dec 2017 Stefan Depeweg, José Miguel Hernández-Lobato, Steffen Udluft, Thomas Runkler

We derive a novel sensitivity analysis of input variables for predictive epistemic and aleatoric uncertainty.
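Such a sensitivity analysis can be approximated numerically by differentiating a scalar uncertainty estimate with respect to each input variable. A finite-difference sketch, where `uncertainty_fn` is an assumed stand-in for a BNN's epistemic or aleatoric uncertainty output (the paper derives the analysis analytically, not by finite differences):

```python
import numpy as np

def uncertainty_sensitivity(uncertainty_fn, x, eps=1e-4):
    """Central finite-difference sensitivity of a predictive-uncertainty
    estimate with respect to each input variable."""
    x = np.asarray(x, dtype=float)
    grad = np.zeros_like(x)
    for i in range(x.size):
        dx = np.zeros_like(x)
        dx[i] = eps
        grad[i] = (uncertainty_fn(x + dx) - uncertainty_fn(x - dx)) / (2 * eps)
    return grad
```

Inputs with large sensitivity are those whose perturbation most changes the model's uncertainty, which is useful for diagnosing where a model's confidence comes from.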

Decomposition of Uncertainty in Bayesian Deep Learning for Efficient and Risk-sensitive Learning

1 code implementation ICML 2018 Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft

Bayesian neural networks with latent variables are scalable and flexible probabilistic models: They account for uncertainty in the estimation of the network weights and, by making use of latent variables, can capture complex noise patterns in the data.

Active Learning Decision Making +2
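The decomposition these models enable is, in essence, the law of total variance over the weight posterior: total predictive variance splits into an epistemic term (spread of predictive means across weight samples) and an aleatoric term (average intrinsic noise). A minimal sketch, assuming Monte-Carlo samples of per-weight-sample predictive means and variances:

```python
import numpy as np

def decompose_uncertainty(mean_preds, var_preds):
    """Law-of-total-variance decomposition from posterior samples
    (rows index weight samples, columns index test points):
      total = Var[E[y|w]] (epistemic) + E[Var[y|w]] (aleatoric)."""
    epistemic = np.var(mean_preds, axis=0)   # spread of means across samples
    aleatoric = np.mean(var_preds, axis=0)   # average intrinsic noise
    return epistemic, aleatoric, epistemic + aleatoric
```

Separating the two terms is what allows risk-sensitive objectives to penalize irreducible noise and exploration objectives to seek out epistemic uncertainty.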

A Benchmark Environment Motivated by Industrial Control Problems

2 code implementations27 Sep 2017 Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

On one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand.

OpenAI Gym Reinforcement Learning (RL)

Uncertainty Decomposition in Bayesian Neural Networks with Latent Variables

no code implementations26 Jun 2017 Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft

Bayesian neural networks (BNNs) with latent variables are probabilistic models which can automatically identify complex stochastic patterns in the data.

Active Learning reinforcement-learning +2

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

no code implementations20 May 2017 Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

The Particle Swarm Optimization Policy (PSO-P) has recently been introduced and shown to produce remarkable results on academic reinforcement learning benchmarks in an off-policy, batch-based setting.

reinforcement-learning Reinforcement Learning (RL)
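PSO-P plans in a receding-horizon fashion: for the current state it searches for a good action sequence with particle swarm optimization, scoring particles by rollouts of a learned model, and executes only the first action. A minimal sketch under assumed action bounds of [-1, 1] and illustrative PSO hyperparameters:

```python
import numpy as np

def pso_plan(model_step, state0, horizon, n_particles=30, iters=50,
             w=0.7, c1=1.4, c2=1.4, seed=0):
    """Sketch of the PSO-P idea: optimize an action sequence for the current
    state by particle swarm search, using model rollouts as the fitness."""
    rng = np.random.default_rng(seed)

    def fitness(actions):
        state, ret = np.array(state0, dtype=float), 0.0
        for a in actions:
            state, reward = model_step(state, a)
            ret += reward
        return ret

    pos = rng.uniform(-1, 1, size=(n_particles, horizon))  # action sequences
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest[np.argmax(pbest_fit)].copy()                 # global best
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, -1, 1)
        fits = np.array([fitness(p) for p in pos])
        improved = fits > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fits[improved]
        g = pbest[np.argmax(pbest_fit)].copy()
    return g[0]  # receding horizon: execute only the first action
```

Because the swarm only queries rollout returns, no gradients of the model or a parametric policy are needed, which is what makes the scheme attractive for batch-based industrial settings.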

Learning and Policy Search in Stochastic Dynamical Systems with Bayesian Neural Networks

2 code implementations23 May 2016 Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft

We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning.

Model-based Reinforcement Learning reinforcement-learning +2
