no code implementations • 18 Nov 2024 • Marnix Suilen, Thom Badings, Eline M. Bovy, David Parker, Nils Jansen
We also discuss how RMDPs relate to other models and how they are used in several contexts, including reinforcement learning and abstraction techniques.
no code implementations • 16 Aug 2024 • Maris F. L. Galesloot, Marnix Suilen, Thiago D. Simão, Steven Carr, Matthijs T. J. Spaan, Ufuk Topcu, Nils Jansen
To compute such robust memory-based policies, we propose the pessimistic iterative planning (PIP) framework, which alternates between two main steps: (1) selecting a pessimistic (non-robust) POMDP via worst-case probability instances from the uncertainty sets; and (2) computing a finite-state controller (FSC) for this pessimistic POMDP.
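The PIP alternation described above can be sketched on a toy one-step problem, where each action's uncertainty set is an interval of success probabilities. All names and the toy model here are illustrative, not taken from the paper (and the real framework computes finite-state controllers for POMDPs, not memoryless choices):

```python
# Toy sketch of pessimistic iterative planning (PIP): alternate between
# (1) instantiating the worst case from the uncertainty sets and
# (2) computing a best response for that pessimistic instance.

def select_worst_case(intervals, policy):
    """Step 1: pick the pessimistic (lowest) value for every action."""
    return {a: lo for a, (lo, hi) in intervals.items()}

def compute_policy(instance):
    """Step 2: best response to the pessimistic instance (stands in
    for FSC computation in the actual framework)."""
    return max(instance, key=instance.get)

def pip(intervals, iters=5):
    policy = None
    for _ in range(iters):
        instance = select_worst_case(intervals, policy)
        policy = compute_policy(instance)
    return policy

best = pip({"a": (0.6, 0.9), "b": (0.7, 0.8)})  # "b": better worst case
```

Note how pessimism flips the preference: action "a" has the higher best case (0.9), but "b" wins under the worst-case instantiation (0.7 vs 0.6).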
no code implementations • 17 Jul 2024 • Lisandro A. Jimenez-Roa, Thiago D. Simão, Zaharah Bukhsh, Tiedo Tinga, Hajo Molegraaf, Nils Jansen, Marielle Stoelinga
Large-scale infrastructure systems are crucial for societal welfare, and their effective management requires strategic forecasting and intervention methods that account for various complexities.
no code implementations • 2 Jun 2024 • Thom Badings, Wietze Koops, Sebastian Junges, Nils Jansen
As our main contribution, we significantly accelerate algorithmic approaches for verifying that a neural network is indeed a reach-avoid supermartingale (RASM).
no code implementations • 9 May 2024 • Wietze Koops, Sebastian Junges, Nils Jansen
Our experiments demonstrate the efficacy and scalability of the approach.
1 code implementation • 8 May 2024 • Eline M. Bovy, Marnix Suilen, Sebastian Junges, Nils Jansen
Partially observable Markov decision processes (POMDPs) rely on the key assumption that probability distributions are precisely known.
1 code implementation • 2 Apr 2024 • Thom Badings, Licio Romao, Alessandro Abate, Nils Jansen
To address this issue, we propose a novel abstraction scheme for stochastic linear systems that exploits the system's stability to obtain significantly smaller abstract models.
no code implementations • 18 Dec 2023 • Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen
However, the challenges of value estimation and belief estimation have only been tackled individually, which prevents existing methods from scaling to settings with many agents.
1 code implementation • 18 Dec 2023 • Merlijn Krale, Thiago D. Simão, Jana Tumova, Nils Jansen
Partial observability and uncertainty are common problems in sequential decision-making that particularly impede the use of formal models such as Markov decision processes (MDPs).
no code implementations • 16 Nov 2023 • Thom Badings, Nils Jansen, Licio Romao, Alessandro Abate
Such autonomous systems are naturally modeled as stochastic dynamical models.
no code implementations • 26 Jul 2023 • Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan
Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence of the guide as training progresses.
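A regularization scheme of this shape can be sketched as a loss with a decaying coefficient on the divergence between student and guide. The exponential schedule and all names below are assumptions for illustration, not the paper's actual formulation:

```python
# Sketch: regularize the student towards the guide early in training,
# then let the guide's influence decay as the student becomes reliable.
import math

def regularized_loss(task_loss, kl_student_guide, step, decay=1e-3):
    # beta starts near 1 (follow the guide) and decays towards 0.
    beta = math.exp(-decay * step)
    return task_loss + beta * kl_student_guide
```

At step 0 the divergence term carries full weight; after many steps the loss reduces to the task loss alone, so the student is eventually trained without the guide's influence.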
1 code implementation • 13 May 2023 • Patrick Wienhöft, Marnix Suilen, Thiago D. Simão, Clemens Dubslaff, Christel Baier, Nils Jansen
In an offline reinforcement learning setting, the safe policy improvement (SPI) problem aims to improve upon the behavior policy, i.e., the policy according to which the sample data has been generated.
no code implementations • 1 May 2023 • Thom Badings, Sebastian Junges, Ahmadreza Marandi, Ufuk Topcu, Nils Jansen
As our main contribution, we present an efficient method to compute these partial derivatives.
1 code implementation • 14 Mar 2023 • Merlijn Krale, Thiago D. Simão, Nils Jansen
In these models, actions consist of two components: a control action that affects the environment, and a measurement action that affects what the agent can observe.
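The two-component actions described above can be made concrete with a small sketch. The field names, toy dynamics, and measurement cost are all illustrative assumptions, not the paper's formalization:

```python
# Sketch of a two-component action: a control action that drives the
# environment, and a measurement action that gates (and prices) observation.
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    control: str   # control component, e.g. "inc" or "dec"
    measure: bool  # measurement component: pay to observe the next state?

def transition(state, control):
    # Toy deterministic dynamics on integer states.
    return state + 1 if control == "inc" else state - 1

def step(state, action, measure_cost=0.1):
    next_state = transition(state, action.control)
    obs = next_state if action.measure else None   # observe only if measured
    cost = measure_cost if action.measure else 0.0
    return next_state, obs, cost
```

The agent thus trades off information against cost: choosing `measure=False` saves the measurement cost but leaves the agent to act without seeing the resulting state.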
no code implementations • 10 Mar 2023 • Thom Badings, Thiago D. Simão, Marnix Suilen, Nils Jansen
This paper focuses on uncertainty that goes beyond this classical interpretation, in particular by drawing a clear distinction between aleatoric and epistemic uncertainty.
no code implementations • 12 Jan 2023 • Thiago D. Simão, Marnix Suilen, Nils Jansen
In our novel approach to the SPI problem for POMDPs, we assume that a finite-state controller (FSC) represents the behavior policy and that finite memory is sufficient to derive optimal policies.
1 code implementation • 4 Jan 2023 • Thom Badings, Licio Romao, Alessandro Abate, David Parker, Hasan A. Poonawala, Marielle Stoelinga, Nils Jansen
This interval MDP (iMDP) is, with a user-specified confidence probability, robust against uncertainty in the transition probabilities, and the tightness of the probability intervals can be controlled through the number of samples.
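The sample-count/tightness trade-off can be illustrated with a simple concentration bound. A Hoeffding-style interval is used here purely for concreteness (the paper's guarantees rest on different, sampling-based machinery); the point is only that the interval width shrinks as the number of samples grows:

```python
# Illustrative sample-based probability interval: estimate p from samples
# and widen the estimate by a confidence-dependent margin eps ~ 1/sqrt(N).
import math

def probability_interval(successes, n_samples, confidence=0.99):
    p_hat = successes / n_samples
    eps = math.sqrt(math.log(2 / (1 - confidence)) / (2 * n_samples))
    return max(0.0, p_hat - eps), min(1.0, p_hat + eps)
```

With 100 samples and an empirical frequency of 0.8, the interval is roughly [0.64, 0.96]; with 10,000 samples at the same frequency it tightens to roughly [0.78, 0.82].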
2 code implementations • 10 Dec 2022 • Dennis Gross, Thiago D. Simão, Nils Jansen, Guillermo A. Pérez
We use this metric to craft optimal adversarial attacks.
1 code implementation • 12 Oct 2022 • Thom Badings, Licio Romao, Alessandro Abate, Nils Jansen
Stochastic noise causes aleatoric uncertainty, whereas imprecise knowledge of model parameters leads to epistemic uncertainty.
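The distinction can be made concrete in a scalar linear system x' = a·x + w: the noise w is aleatoric (irreducible randomness), while an imprecisely known parameter a, here an interval of possible values, is epistemic (which model is the true one?). All numbers below are made up for illustration:

```python
# Minimal illustration of aleatoric vs epistemic uncertainty in x' = a*x + w.
import random

def step(x, a_interval=(0.8, 0.9), noise_std=0.1, rng=random):
    a = rng.uniform(*a_interval)   # epistemic: the true parameter is unknown
    w = rng.gauss(0.0, noise_std)  # aleatoric: stochastic noise on each step
    return a * x + w
```

More data about the system can shrink `a_interval` (epistemic uncertainty is reducible), whereas the noise `w` persists no matter how much data is collected.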
1 code implementation • 2 Oct 2022 • Yannick Hogewind, Thiago D. Simão, Tal Kachman, Nils Jansen
We address the problem of safe reinforcement learning from pixel observations.
2 code implementations • 15 Sep 2022 • Dennis Gross, Nils Jansen, Sebastian Junges, Guillermo A. Pérez
This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking.
no code implementations • 1 Aug 2022 • Zaharah A. Bukhsh, Nils Jansen, Hajo Molegraaf
We approach the problem of rehabilitation planning in both online and offline deep reinforcement learning (DRL) settings.
1 code implementation • 31 May 2022 • Marnix Suilen, Thiago D. Simão, David Parker, Nils Jansen
Markov decision processes (MDPs) are formal models commonly used in sequential decision-making.
no code implementations • 2 Apr 2022 • Steven Carr, Nils Jansen, Sebastian Junges, Ufuk Topcu
Safe exploration is a common problem in reinforcement learning (RL) that aims to prevent agents from making disastrous decisions while exploring their environment.
no code implementations • 25 Oct 2021 • Thom S. Badings, Alessandro Abate, Nils Jansen, David Parker, Hasan A. Poonawala, Marielle Stoelinga
We use state-of-the-art verification techniques to provide guarantees on the iMDP, and compute a controller for which these guarantees carry over to the autonomous system.
1 code implementation • 30 Jun 2021 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
The parameter synthesis problem is to compute an instantiation of these unspecified parameters such that the resulting MDP satisfies the temporal logic specification.
1 code implementation • 3 Mar 2021 • Thom Badings, Hasan A. Poonawala, Marielle Stoelinga, Nils Jansen
By construction, any policy on the abstraction can be refined into a piecewise linear feedback controller for the LTI system.
no code implementations • 7 Feb 2021 • Zaharah A. Bukhsh, Nils Jansen, Aaqib Saeed
We, therefore, evaluate a combination of in-domain and cross-domain transfer learning strategies for damage detection in bridges.
no code implementations • 29 Jan 2021 • Thom S. Badings, Arnd Hartmanns, Nils Jansen, Marnix Suilen
We study a smart grid with wind power and battery storage.
no code implementations • 24 Sep 2020 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu
(3) We linearize this dual problem and (4) solve the resulting finite linear program to obtain locally optimal solutions to the original problem.
no code implementations • 31 Aug 2020 • Ajaya Adhikari, Richard den Hollander, Ioannis Tolios, Michael van Bekkum, Anneloes Bal, Stijn Hendriks, Maarten Kruithof, Dennis Gross, Nils Jansen, Guillermo Pérez, Kit Buurman, Stephan Raaijmakers
The traditional way of hiding military assets from sight is camouflage, for example by using camouflage nets.
no code implementations • 16 Jul 2020 • Leonore Winterer, Ralf Wimmer, Nils Jansen, Bernd Becker
Second, based on the results of the original MILP, we employ a preprocessing of the POMDP to encompass memory-based decisions.
1 code implementation • 30 Jun 2020 • Sebastian Junges, Nils Jansen, Sanjit A. Seshia
Partially observable Markov decision processes (POMDPs) are a well-known stochastic model for sequential decision making under limited information.
no code implementations • 12 May 2020 • Dennis Gross, Nils Jansen, Guillermo A. Pérez, Stephan Raaijmakers
The robustness-checking problem consists of assessing, given a set of classifiers and a labelled data set, whether there exists a randomized attack that induces a certain expected loss against all classifiers.
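For a finite set of attacks and classifiers, this question has the shape of a small matrix game: does some mixture over attacks guarantee expected loss at least the threshold against every classifier? The grid-search sketch below, with a made-up loss matrix and only two candidate attacks, is an illustration of the decision question, not the paper's algorithm:

```python
# Toy robustness check: does some randomized attack (a distribution over
# two candidate attacks) achieve expected loss >= threshold against
# every classifier? loss[i][c] is the loss of attack i on classifier c.

def exists_randomized_attack(loss, threshold, grid=101):
    for k in range(grid):
        p = k / (grid - 1)  # mixture weight on attack 0
        worst = min(p * loss[0][c] + (1 - p) * loss[1][c]
                    for c in range(len(loss[0])))
        if worst >= threshold:
            return True
    return False
```

For the loss matrix [[1, 0], [0, 1]] (each attack fools exactly one classifier), the 50/50 mixture guarantees expected loss 0.5 against both classifiers, which no pure attack achieves; no randomized attack can guarantee more.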
no code implementations • 13 Feb 2020 • Steven Carr, Nils Jansen, Ufuk Topcu
Recurrent neural networks (RNNs) have emerged as an effective representation of control policies in sequential decision-making problems.
no code implementations • 1 Aug 2019 • Dung T. Phan, Radu Grosu, Nils Jansen, Nicola Paoletti, Scott A. Smolka, Scott D. Stoller
NSA not only provides safety assurances in the presence of a possibly unsafe neural controller, but can also improve the safety of such a controller in an online setting via retraining, without overly degrading its performance.
no code implementations • 15 May 2019 • Murat Cubuktepe, Nils Jansen, Mohammed Alsiekh, Ufuk Topcu
We design the autonomy protocol to ensure that the resulting robot behavior satisfies given safety and performance specifications in probabilistic temporal logic.
no code implementations • 20 Mar 2019 • Steven Carr, Nils Jansen, Ralf Wimmer, Alexandru C. Serban, Bernd Becker, Ufuk Topcu
The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints.
1 code implementation • 15 Feb 2019 • Milan Češka, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen
This paper considers large families of Markov chains (MCs) that are defined over a set of parameters with finite discrete domains.
no code implementations • 28 Sep 2018 • Mohamadreza Ahmadi, Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
Then, the deception problem is to compute a strategy for the deceiver that minimizes the expected cost of deception against all strategies of the infiltrator.
no code implementations • 16 Jul 2018 • Nils Jansen, Bettina Könighofer, Sebastian Junges, Alexandru C. Serban, Roderick Bloem
This paper targets the efficient construction of a safety shield for decision making in scenarios that incorporate uncertainty.
no code implementations • 5 Mar 2018 • Murat Cubuktepe, Nils Jansen, Sebastian Junges, Joost-Pieter Katoen, Ufuk Topcu
This paper considers parametric Markov decision processes (pMDPs) whose transitions are equipped with affine functions over a finite set of parameters.
no code implementations • 27 Feb 2018 • Steven Carr, Nils Jansen, Ralf Wimmer, Jie Fu, Ufuk Topcu
The efficient verification of this MC gives quantitative insights into the quality of the inferred human strategy by proving or disproving given system specifications.
no code implementations • 14 Aug 2017 • Leonore Winterer, Sebastian Junges, Ralf Wimmer, Nils Jansen, Ufuk Topcu, Joost-Pieter Katoen, Bernd Becker
We study synthesis problems with constraints in partially observable Markov decision processes (POMDPs), where the objective is to compute a strategy for an agent that is guaranteed to satisfy certain safety and performance specifications.
1 code implementation • 28 Oct 2016 • Sebastian Junges, Nils Jansen, Joost-Pieter Katoen, Ufuk Topcu
Probabilistic model checking is used to predict the human's behavior.
no code implementations • 26 Oct 2016 • Nils Jansen, Murat Cubuktepe, Ufuk Topcu
We formalize synthesis of shared control protocols with correctness guarantees for temporal logic specifications.