Search Results for author: Thiago D. Simão

Found 9 papers, 4 papers with code

Robust Active Measuring under Model Uncertainty

1 code implementation | 18 Dec 2023 | Merlijn Krale, Thiago D. Simão, Jana Tumova, Nils Jansen

Partial observability and uncertainty are common problems in sequential decision-making that particularly impede the use of formal models such as Markov decision processes (MDPs).

Decision Making

Factored Online Planning in Many-Agent POMDPs

no code implementations | 18 Dec 2023 | Maris F. L. Galesloot, Thiago D. Simão, Sebastian Junges, Nils Jansen

However, the challenges of value estimation and belief estimation have only been tackled individually, which prevents existing methods from scaling to settings with many agents.

Reinforcement Learning by Guided Safe Exploration

no code implementations | 26 Jul 2023 | Qisong Yang, Thiago D. Simão, Nils Jansen, Simon H. Tindemans, Matthijs T. J. Spaan

Drawing from transfer learning, we also regularize a target policy (the student) towards the guide while the student is unreliable and gradually eliminate the influence of the guide as training progresses.

Reinforcement Learning (RL)

More for Less: Safe Policy Improvement With Stronger Performance Guarantees

1 code implementation | 13 May 2023 | Patrick Wienhöft, Marnix Suilen, Thiago D. Simão, Clemens Dubslaff, Christel Baier, Nils Jansen

In an offline reinforcement learning setting, the safe policy improvement (SPI) problem aims to improve the performance of a behavior policy according to which sample data has been generated.

Act-Then-Measure: Reinforcement Learning for Partially Observable Environments with Active Measuring

1 code implementation | 14 Mar 2023 | Merlijn Krale, Thiago D. Simão, Nils Jansen

In these models, actions consist of two components: a control action that affects the environment, and a measurement action that affects what the agent can observe.

Reinforcement Learning (RL)

Decision-Making Under Uncertainty: Beyond Probabilities

no code implementations | 10 Mar 2023 | Thom Badings, Thiago D. Simão, Marnix Suilen, Nils Jansen

In this paper, the focus is on the uncertainty that goes beyond this classical interpretation, particularly by employing a clear distinction between aleatoric and epistemic uncertainty.

Decision Making, Decision Making Under Uncertainty

Safe Policy Improvement for POMDPs via Finite-State Controllers

no code implementations | 12 Jan 2023 | Thiago D. Simão, Marnix Suilen, Nils Jansen

In our novel approach to the SPI problem for POMDPs, we assume that a finite-state controller (FSC) represents the behavior policy and that finite memory is sufficient to derive optimal policies.

Reinforcement Learning (RL)

Robust Anytime Learning of Markov Decision Processes

1 code implementation | 31 May 2022 | Marnix Suilen, Thiago D. Simão, David Parker, Nils Jansen

Markov decision processes (MDPs) are formal models commonly used in sequential decision-making.

Bayesian Inference, Decision Making

Safe Policy Improvement with an Estimated Baseline Policy

no code implementations | 11 Sep 2019 | Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes

Previous work has shown the unreliability of existing algorithms in the batch Reinforcement Learning setting, and proposed the theoretically-grounded Safe Policy Improvement with Baseline Bootstrapping (SPIBB) fix: reproduce the baseline policy in the uncertain state-action pairs, in order to control the variance on the trained policy performance.

Management
