no code implementations • 14 Apr 2024 • Simon Eisenmann, Daniel Hein, Steffen Udluft, Thomas A. Runkler
The policy is optimized with a gradient-free optimization scheme using the return estimate given by the model as the fitness function.
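The scheme described above can be sketched in a few lines: a policy parameter is mutated by a gradient-free (1+1) evolution strategy, and each candidate is scored by the return estimate obtained from rolling it out in a model. The linear dynamics model, linear policy, and ES variant below are illustrative stand-ins, not the paper's actual components.

```python
import numpy as np

rng = np.random.default_rng(0)

def model_step(state, action):
    # Hypothetical learned dynamics model: returns next state and reward.
    next_state = 0.9 * state + 0.1 * action
    reward = -float(state**2 + 0.1 * action**2)
    return next_state, reward

def estimated_return(theta, horizon=50):
    # Fitness function: the return estimate from rolling the policy
    # out in the model, as described in the abstract.
    state, total = 1.0, 0.0
    for _ in range(horizon):
        action = theta * state              # simple linear policy
        state, reward = model_step(state, action)
        total += reward
    return total

# Gradient-free (1+1) evolution strategy: mutate, keep the candidate if fitter.
theta, best = 0.0, estimated_return(0.0)
for _ in range(200):
    candidate = theta + rng.normal(scale=0.3)
    fitness = estimated_return(candidate)
    if fitness > best:
        theta, best = candidate, fitness
```

Any population-based or direct-search optimizer (CMA-ES, particle swarms) slots into the same loop, since only fitness evaluations of the model-based return are required, never gradients.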
no code implementations • 11 Aug 2023 • Marc Weber, Phillip Swazinna, Daniel Hein, Steffen Udluft, Volkmar Sterzing
Offline reinforcement learning provides a viable approach to obtain advanced control strategies for dynamical systems, in particular when direct interaction with the environment is not available.
no code implementations • 16 Jun 2023 • Phillip Swazinna, Steffen Udluft, Thomas Runkler
Recently, offline RL algorithms have been proposed that remain adaptive at runtime.
1 code implementation • 1 Aug 2022 • Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft
Based on this finding, we develop adaptations, the Adv-Soft-SPIBB algorithms, and show that they are provably safe.
no code implementations • 9 Jun 2022 • Simon Wiedemann, Daniel Hein, Steffen Udluft, Christian Mendl
We present a full implementation and simulation of a novel quantum reinforcement learning method.
1 code implementation • 21 May 2022 • Phillip Swazinna, Steffen Udluft, Thomas Runkler
At the same time, offline RL algorithms are not able to tune their most important hyperparameter: the proximity of the learned policy to the original policy.
1 code implementation • 28 Jan 2022 • Philipp Scholl, Felix Dietrich, Clemens Otte, Steffen Udluft
Safe Policy Improvement (SPI) aims at provable guarantees that a learned policy is at least approximately as good as a given baseline policy.
1 code implementation • 14 Jan 2022 • Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler
Offline reinforcement learning (RL) algorithms are often designed with environments such as MuJoCo in mind, in which the planning horizon is extremely long and no noise exists.
no code implementations • 26 Nov 2021 • Phillip Swazinna, Steffen Udluft, Thomas Runkler
Recently developed offline reinforcement learning algorithms have made it possible to learn policies directly from pre-collected datasets, giving rise to a new dilemma for practitioners: since the performance these algorithms deliver depends greatly on the dataset presented to them, practitioners need to pick the right dataset among those available.
1 code implementation • 12 Jul 2021 • Phillip Swazinna, Steffen Udluft, Daniel Hein, Thomas Runkler
In offline reinforcement learning, a policy needs to be learned from a single pre-collected dataset.
no code implementations • 12 Aug 2020 • Phillip Swazinna, Steffen Udluft, Thomas Runkler
State-of-the-art reinforcement learning algorithms mostly rely on being allowed to directly interact with their environment to collect millions of observations.
no code implementations • 29 Apr 2018 • Daniel Hein, Steffen Udluft, Thomas A. Runkler
Autonomously training interpretable control strategies, called policies, using pre-existing plant trajectory data is of great interest in industrial applications.
no code implementations • 12 Dec 2017 • Daniel Hein, Steffen Udluft, Thomas A. Runkler
Here we introduce the genetic programming for reinforcement learning (GPRL) approach, based on model-based batch reinforcement learning and genetic programming, which autonomously learns policy equations from pre-existing default state-action trajectory samples.
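The core idea of evolving interpretable policy equations against a learned model can be sketched as follows. This is a deliberately minimal toy, not the GPRL algorithm itself: the expression grammar, dynamics model, and selection scheme are all hypothetical, and mutation/crossover are omitted for brevity.

```python
import random

random.seed(0)

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b, '*': lambda a, b: a * b}

def random_expr(depth=2):
    # Grow a random expression tree over the state variable 's' and constants.
    if depth == 0 or random.random() < 0.3:
        return 's' if random.random() < 0.5 else round(random.uniform(-2, 2), 2)
    op = random.choice(list(OPS))
    return (op, random_expr(depth - 1), random_expr(depth - 1))

def evaluate(expr, s):
    # Interpret an expression tree: leaves are 's' or constants.
    if expr == 's':
        return s
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    return OPS[op](evaluate(left, s), evaluate(right, s))

def fitness(expr, horizon=20):
    # Return estimate from a stand-in learned model (model-based batch RL):
    # no interaction with the real system is needed.
    s, total = 1.0, 0.0
    for _ in range(horizon):
        a = max(-1.0, min(1.0, evaluate(expr, s)))  # clip action to [-1, 1]
        s = 0.9 * s + 0.1 * a
        total += -(s ** 2)
    return total

# Evolve: keep the fitter half, refill with fresh random expressions.
pop = [random_expr() for _ in range(40)]
for _ in range(15):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:20] + [random_expr() for _ in range(20)]
best = max(pop, key=fitness)
```

The payoff of this representation is that `best` is a human-readable equation, e.g. a small algebraic expression in `s`, rather than a neural network.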
no code implementations • 10 Dec 2017 • Stefan Depeweg, José Miguel Hernández-Lobato, Steffen Udluft, Thomas Runkler
We derive a novel sensitivity analysis of input variables for predictive epistemic and aleatoric uncertainty.
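A minimal sketch of the two ingredients named above: decomposing predictive uncertainty into epistemic and aleatoric parts via the law of total variance, and probing the sensitivity of one part to an input by finite differences. The predictive model and the heteroscedastic noise law below are illustrative assumptions, not the paper's method or its derived sensitivity formula.

```python
import numpy as np

rng = np.random.default_rng(1)

def predict(x, n_samples=50):
    # Stand-in for a Bayesian neural network: each posterior sample yields a
    # predictive mean; the noise variance grows with the input (heteroscedastic).
    means = np.sin(x) + rng.normal(scale=0.05, size=n_samples)
    noise_var = 0.1 + 0.05 * x**2
    return means, noise_var

def decompose(x):
    # Law of total variance: epistemic uncertainty is the variance of the
    # sampled means; aleatoric uncertainty is the expected noise variance.
    means, noise_var = predict(x)
    return means.var(), noise_var

def aleatoric_sensitivity(x, eps=1e-4):
    # Finite-difference sensitivity of the aleatoric part w.r.t. the input.
    _, a_plus = decompose(x + eps)
    _, a_minus = decompose(x - eps)
    return (a_plus - a_minus) / (2 * eps)

epi, ale = decompose(1.0)
```

Note that the epistemic term is itself a Monte Carlo estimate, so in practice its input sensitivity needs variance reduction or analytic gradients rather than naive finite differences.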
1 code implementation • ICML 2018 • Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft
Bayesian neural networks with latent variables are scalable and flexible probabilistic models: They account for uncertainty in the estimation of the network weights and, by making use of latent variables, can capture complex noise patterns in the data.
2 code implementations • 27 Sep 2017 • Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand.
no code implementations • 26 Jun 2017 • Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft
Bayesian neural networks (BNNs) with latent variables are probabilistic models which can automatically identify complex stochastic patterns in the data.
no code implementations • 20 May 2017 • Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing
The Particle Swarm Optimization Policy (PSO-P) was recently introduced and has been shown to produce remarkable results when interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting.
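The PSO-P idea can be sketched as receding-horizon planning: a particle swarm searches directly over action sequences scored by a learned model, and only the first action of the best sequence is executed. The model, the horizon, and the PSO coefficients below are illustrative choices, not the published configuration.

```python
import numpy as np

rng = np.random.default_rng(2)

def model_rollout(state, actions):
    # Hypothetical learned model: estimated return of an action sequence.
    total = 0.0
    for a in actions:
        state = 0.9 * state + 0.1 * a
        total += -(state**2)
    return total

def pso_plan(state, horizon=10, n_particles=30, iters=40):
    # Particles are whole action sequences; standard PSO velocity update
    # with inertia, cognitive, and social terms.
    pos = rng.uniform(-1, 1, (n_particles, horizon))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([model_rollout(state, p) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, -1, 1)   # respect action bounds
        fit = np.array([model_rollout(state, p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest[0]  # execute only the first action (receding horizon)

action = pso_plan(state=1.0)
```

Because the search happens at decision time against a model, the approach needs no explicit policy representation, which is what makes it suited to the off-policy, batch-based setting mentioned above.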
no code implementations • 19 Oct 2016 • Daniel Hein, Alexander Hentschel, Thomas Runkler, Steffen Udluft
To the best of our knowledge, this approach is the first to relate self-organizing fuzzy controllers to model-based batch RL.
no code implementations • 12 Oct 2016 • Daniel Hein, Alexander Hentschel, Volkmar Sterzing, Michel Tokic, Steffen Udluft
A novel reinforcement learning benchmark, called Industrial Benchmark, is introduced.
2 code implementations • 23 May 2016 • Stefan Depeweg, José Miguel Hernández-Lobato, Finale Doshi-Velez, Steffen Udluft
We present an algorithm for model-based reinforcement learning that combines Bayesian neural networks (BNNs) with random roll-outs and stochastic optimization for policy learning.
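The combination named above can be sketched as follows: dynamics models are sampled from a posterior (standing in for a BNN), policies are scored by averaging returns over random roll-outs through those samples, and the policy is improved by stochastic optimization. The linear model family, Gaussian posterior, and random-search optimizer are illustrative simplifications, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_model():
    # Stand-in for drawing one plausible dynamics model from a BNN posterior;
    # additive noise in the step mimics the model's stochastic transitions.
    a = rng.normal(0.9, 0.02)
    b = rng.normal(0.1, 0.02)
    def step(s, u):
        return a * s + b * u + rng.normal(scale=0.01)
    return step

def expected_return(theta, n_rollouts=20, horizon=30):
    # Average return over random roll-outs through sampled models, so the
    # policy is evaluated under model uncertainty rather than a point estimate.
    total = 0.0
    for _ in range(n_rollouts):
        step, s = sample_model(), 1.0
        for _ in range(horizon):
            s = step(s, theta * s)      # linear policy
            total += -(s**2)
    return total / n_rollouts

# Stochastic optimization of the policy parameter via simple random search.
best_theta, best = 0.0, expected_return(0.0)
for _ in range(50):
    candidate = best_theta + rng.normal(scale=0.2)
    value = expected_return(candidate)
    if value > best:
        best_theta, best = candidate, value
```

Averaging over posterior samples penalizes policies that only perform well under one plausible model, which is the practical benefit of pairing BNNs with random roll-outs.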