1 code implementation • 30 Jan 2023 • Alexandra Cimpean, Timothy Verstraeten, Lander Willem, Niel Hens, Ann Nowé, Pieter Libin
$m$-top exploration allows the algorithm to learn $m$ policies for which it expects the highest utility, enabling experts to inspect this small set of alternative strategies, along with their quantified uncertainty.
no code implementations • 1 Jul 2022 • Conor F. Hayes, Timothy Verstraeten, Diederik M. Roijers, Enda Howley, Patrick Mannion
In such settings a set of optimal policies must be computed.
no code implementations • 2 Jun 2021 • Conor F. Hayes, Timothy Verstraeten, Diederik M. Roijers, Enda Howley, Patrick Mannion
In this case, to apply multi-objective reinforcement learning, the expected utility of the returns must be optimised.
1 code implementation • 17 Mar 2021 • Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives.
1 code implementation • 19 Jan 2021 • Timothy Verstraeten, Pieter-Jan Daems, Eugenio Bargiacchi, Diederik M. Roijers, Pieter J. K. Libin, Jan Helsen
This is a non-trivial optimization problem, as complex dependencies exist between the wind turbines.
1 code implementation • 14 Nov 2020 • Roxana Rădulescu, Timothy Verstraeten, Yijie Zhang, Patrick Mannion, Diederik M. Roijers, Ann Nowé
We contribute novel actor-critic and policy gradient formulations to allow reinforcement learning of mixed strategies in this setting, along with extensions that incorporate opponent policy reconstruction and learning with opponent learning awareness (i.e., learning while considering the impact of one's policy when anticipating the opponent's learning step).
1 code implementation • 30 Mar 2020 • Pieter Libin, Arno Moonens, Timothy Verstraeten, Fabian Perez-Sanjines, Niel Hens, Philippe Lemey, Ann Nowé
For this reason, we investigate a deep reinforcement learning approach to automatically learn prevention strategies in the context of pandemic influenza.
no code implementations • 15 Jan 2020 • Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé
We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes.
1 code implementation • 22 Nov 2019 • Timothy Verstraeten, Eugenio Bargiacchi, Pieter JK Libin, Jan Helsen, Diederik M. Roijers, Ann Nowé
In this task, wind turbines must coordinate their alignments with respect to the incoming wind vector in order to optimize power production.
1 code implementation • 22 Nov 2019 • Timothy Verstraeten, Pieter JK Libin, Ann Nowé
In many settings, such as wind farms, multiple machines are deployed to perform the same task; such a group of machines is called a fleet.
no code implementations • 30 Sep 2019 • Felipe Gomez Marulanda, Pieter Libin, Timothy Verstraeten, Ann Nowé
In general, our approach outperforms PointNet on every family of 3D geometries on which the models were tested.
no code implementations • ICML 2018 • Eugenio Bargiacchi, Timothy Verstraeten, Diederik Roijers, Ann Nowé, Hado Hasselt
Learning to coordinate between multiple agents is an important challenge in many reinforcement learning settings.
no code implementations • 16 Nov 2017 • Pieter Libin, Timothy Verstraeten, Diederik M. Roijers, Jelena Grujic, Kristof Theys, Philippe Lemey, Ann Nowé
We evaluate these algorithms in a realistic experimental setting and demonstrate that it is possible to identify the optimal strategy using only a limited number of model evaluations, i.e., 2 to 3 times faster than the uniform sampling method, the predominant technique used for epidemiological decision making in the literature.