Search Results for author: Diederik M. Roijers

Found 33 papers, 12 papers with code

Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning

no code implementations • 11 Feb 2024 • Willem Röpke, Mathieu Reymond, Patrick Mannion, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu

A significant challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies that attain optimal performance under different preferences.

Multi-Objective Reinforcement Learning · reinforcement-learning
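The Pareto front mentioned in the abstract can be illustrated with a small sketch: given value vectors for a set of policies, keep exactly those not dominated by any other. This is a generic illustration, not code from the paper; all names and numbers are made up.

```python
def pareto_front(values):
    """Indices of non-dominated value vectors, assuming maximisation."""
    def dominates(a, b):
        # a dominates b: at least as good everywhere, strictly better somewhere.
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))
    return [i for i, v in enumerate(values)
            if not any(dominates(w, v) for j, w in enumerate(values) if j != i)]

# Value vectors of four hypothetical policies over two objectives.
vals = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.5, 1.5)]
print(pareto_front(vals))  # [0, 1, 2] -- the last policy is dominated by the second
```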

Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning

no code implementations • 5 Feb 2024 • Peter Vamplew, Cameron Foale, Conor F. Hayes, Patrick Mannion, Enda Howley, Richard Dazeley, Scott Johnson, Johan Källström, Gabriel Ramos, Roxana Rădulescu, Willem Röpke, Diederik M. Roijers

Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards.

Multi-Objective Reinforcement Learning · reinforcement-learning
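The utility-based paradigm can be sketched concretely: the user's utility function maps a vector of objective returns to a scalar. The functions below are generic illustrations (not from the paper); note how a nonlinear utility can break a tie that a linear one cannot.

```python
def linear_utility(returns, weights):
    """A linear utility: weighted sum of the objective returns."""
    return sum(w * r for w, r in zip(weights, returns))

def product_utility(returns):
    """An illustrative nonlinear utility that rewards balanced outcomes."""
    u = 1.0
    for r in returns:
        u *= r
    return u

a, b = (4.0, 1.0), (2.5, 2.5)
print(linear_utility(a, (0.5, 0.5)), linear_utility(b, (0.5, 0.5)))  # 2.5 2.5: a tie
print(product_utility(a), product_utility(b))  # 4.0 6.25: prefers the balanced b
```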

What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization

no code implementations • 19 Nov 2023 • Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, Pradeep K. Murukannaiah

We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms.

Ethics

Distributional Multi-Objective Decision Making

1 code implementation • 9 May 2023 • Willem Röpke, Conor F. Hayes, Patrick Mannion, Enda Howley, Ann Nowé, Diederik M. Roijers

For effective decision support in scenarios with conflicting objectives, sets of potentially optimal solutions can be presented to the decision maker.

Decision Making

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

no code implementations • 6 Mar 2023 • Raphael Avalos, Florent Delgrange, Ann Nowé, Guillermo A. Pérez, Diederik M. Roijers

A probability distribution over the true state, maintained as the agent's belief, can serve as a sufficient statistic of the history, but computing it requires access to the model of the environment and is often intractable.
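The belief referred to here is maintained with the standard discrete Bayes filter. A minimal sketch under made-up two-state dynamics (the transition table T and observation table O are illustrative, not from the paper):

```python
def belief_update(belief, action, obs, T, O):
    """One Bayes-filter step: b'(s') is proportional to
    O[s'][obs] * sum over s of T[s][action][s'] * b(s)."""
    n = len(belief)
    new_b = [O[s2][obs] * sum(T[s][action][s2] * belief[s] for s in range(n))
             for s2 in range(n)]
    z = sum(new_b)  # normalising constant
    return [x / z for x in new_b]

# Tiny two-state, one-action example with made-up probabilities.
T = [[[0.9, 0.1]], [[0.2, 0.8]]]   # T[s][a][s']
O = [[0.8, 0.2], [0.3, 0.7]]       # O[s'][obs]
b = belief_update([0.5, 0.5], 0, 0, T, O)
print(b)  # belief shifts toward state 0 after observing obs 0
```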

Sample-Efficient Multi-Objective Learning via Generalized Policy Improvement Prioritization

2 code implementations • 18 Jan 2023 • Lucas N. Alegre, Ana L. C. Bazzan, Diederik M. Roijers, Ann Nowé, Bruno C. da Silva

Finally, we introduce a bound that characterizes the maximum utility loss (with respect to the optimal solution) incurred by the partial solutions computed by our method throughout learning.

Active Learning · Multi-Objective Reinforcement Learning

Determining Accessible Sidewalk Width by Extracting Obstacle Information from Point Clouds

1 code implementation • 8 Nov 2022 • Cláudia Fonseca Pinhão, Chris Eijgenstein, Iva Gornishka, Shayla Jansen, Diederik M. Roijers, Daan Bloembergen

Obstacles on the sidewalk often block the path, limiting passage and resulting in frustration and wasted time, especially for citizens and visitors who use assistive devices (wheelchairs, walkers, strollers, canes, etc.).

Exploring the Pareto front of multi-objective COVID-19 mitigation policies using reinforcement learning

no code implementations • 11 Apr 2022 • Mathieu Reymond, Conor F. Hayes, Lander Willem, Roxana Rădulescu, Steven Abrams, Diederik M. Roijers, Enda Howley, Patrick Mannion, Niel Hens, Ann Nowé, Pieter Libin

As decision making in the context of epidemic mitigation is hard, reinforcement learning provides a methodology to automatically learn prevention strategies in combination with complex epidemic models.

Decision Making · Multi-Objective Reinforcement Learning · +1

Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning

no code implementations • 23 Dec 2021 • Raphaël Avalos, Mathieu Reymond, Ann Nowé, Diederik M. Roijers

Many recent successful off-policy multi-agent reinforcement learning (MARL) algorithms for cooperative partially observable environments focus on finding factorized value functions, leading to convoluted network structures.

reinforcement-learning · Reinforcement Learning (RL) · +3

Preference Communication in Multi-Objective Normal-Form Games

1 code implementation • 17 Nov 2021 • Willem Röpke, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu

We consider preference communication in two-player multi-objective normal-form games.
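A multi-objective normal-form game assigns each joint action a vector of payoffs, one entry per objective. As a hypothetical illustration (the payoff table and all names are made up, and a linear utility over objectives is assumed), a player's best pure response can be computed like this:

```python
def best_response(payoffs, opponent_action, weights):
    """Best pure response when vector payoffs are scalarised by a linear utility.
    payoffs[my_action][opp_action] is a tuple of per-objective payoffs."""
    def u(vec):
        return sum(w * v for w, v in zip(weights, vec))
    return max(range(len(payoffs)), key=lambda a: u(payoffs[a][opponent_action]))

# Made-up 2x2 two-objective payoff table for the row player.
P = [[(3, 0), (1, 1)],
     [(0, 3), (2, 2)]]
print(best_response(P, opponent_action=1, weights=(0.5, 0.5)))  # 1
print(best_response(P, opponent_action=0, weights=(1.0, 0.0)))  # 0
```

Which action is a best response depends on the utility weights, which is exactly why players in such games may benefit from communicating their preferences.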

Opponent Learning Awareness and Modelling in Multi-Objective Normal Form Games

1 code implementation • 14 Nov 2020 • Roxana Rădulescu, Timothy Verstraeten, Yijie Zhang, Patrick Mannion, Diederik M. Roijers, Ann Nowé

We contribute novel actor-critic and policy gradient formulations to allow reinforcement learning of mixed strategies in this setting, along with extensions that incorporate opponent policy reconstruction and learning with opponent learning awareness (i.e., learning while considering the impact of one's policy when anticipating the opponent's learning step).

Time Efficiency in Optimization with a Bayesian-Evolutionary Algorithm

no code implementations • 4 May 2020 • Gongjin Lan, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben

Evolutionary Algorithms (EA) on the other hand rely on search heuristics that typically do not depend on all previous data and can be done in constant time.

Bayesian Optimization · Evolutionary Algorithms

A utility-based analysis of equilibria in multi-objective normal form games

no code implementations • 17 Jan 2020 • Roxana Rădulescu, Patrick Mannion, Yijie Zhang, Diederik M. Roijers, Ann Nowé

In multi-objective multi-agent systems (MOMAS), agents explicitly consider the possible tradeoffs between conflicting objective functions.

Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping

no code implementations • 15 Jan 2020 • Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé

We present a new model-based reinforcement learning algorithm, Cooperative Prioritized Sweeping, for efficient learning in multi-agent Markov decision processes.

Model-based Reinforcement Learning · Multi-agent Reinforcement Learning · +3

Multi-Agent Thompson Sampling for Bandit Applications with Sparse Neighbourhood Structures

1 code implementation • 22 Nov 2019 • Timothy Verstraeten, Eugenio Bargiacchi, Pieter JK Libin, Jan Helsen, Diederik M. Roijers, Ann Nowé

In this task, wind turbines must coordinate their alignments with respect to the incoming wind vector in order to optimize power production.

Thompson Sampling
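Thompson sampling itself is simple to sketch for a Bernoulli bandit: sample a mean from each arm's Beta posterior and play the arm with the highest sample. The snippet below is a generic single-agent illustration with made-up arm probabilities; it does not reproduce the paper's multi-agent, sparse-neighbourhood coordination.

```python
import random

def thompson_step(successes, failures):
    """Sample from each arm's Beta posterior and pick the argmax."""
    samples = [random.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(samples)), key=samples.__getitem__)

random.seed(0)
true_p = [0.3, 0.7]            # hidden success probabilities (illustrative)
S, F = [0, 0], [0, 0]          # per-arm success/failure counts
for _ in range(500):
    a = thompson_step(S, F)
    if random.random() < true_p[a]:
        S[a] += 1
    else:
        F[a] += 1
print(S[1] + F[1], S[0] + F[0])  # the better arm accumulates far more pulls
```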

Multi-Objective Multi-Agent Decision Making: A Utility-based Analysis and Survey

no code implementations • 6 Sep 2019 • Roxana Rădulescu, Patrick Mannion, Diederik M. Roijers, Ann Nowé

We develop a new taxonomy that classifies multi-objective multi-agent decision-making settings on the basis of their reward structures and of which utility functions are applied, and how.

Decision Making

The Actor-Advisor: Policy Gradient With Off-Policy Advice

no code implementations • 7 Feb 2019 • Hélène Plisnier, Denis Steckelmacher, Diederik M. Roijers, Ann Nowé

In this paper, we propose an elegant solution, the Actor-Advisor architecture, in which a Policy Gradient actor learns from unbiased Monte-Carlo returns, while being shaped (or advised) by the Softmax policy arising from an off-policy critic.

Transfer Learning
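One plausible way to realise such shaping (a hypothetical sketch, not the paper's exact formulation) is to mix the actor's action distribution with the advisor's by an element-wise product and renormalise, so that actions discouraged by either policy become unlikely:

```python
def shaped_policy(actor_probs, advisor_probs):
    """Shape the actor's policy with an advisory policy via an
    element-wise product followed by renormalisation (illustrative)."""
    mixed = [a * b for a, b in zip(actor_probs, advisor_probs)]
    z = sum(mixed)
    return [m / z for m in mixed]

# Made-up distributions over three actions: the advisor's preference
# for action 1 pulls the mixed policy toward it.
print(shaped_policy([0.5, 0.3, 0.2], [0.1, 0.6, 0.3]))
```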

Dynamic Weights in Multi-Objective Deep Reinforcement Learning

3 code implementations • 20 Sep 2018 • Axel Abels, Diederik M. Roijers, Tom Lenaerts, Ann Nowé, Denis Steckelmacher

In the dynamic weights setting the relative importance changes over time and specialized algorithms that deal with such change, such as a tabular Reinforcement Learning (RL) algorithm by Natarajan and Tadepalli (2005), are required.

Multi-Objective Reinforcement Learning · reinforcement-learning

Directed Policy Gradient for Safe Reinforcement Learning with Human Advice

no code implementations • 13 Aug 2018 • Hélène Plisnier, Denis Steckelmacher, Tim Brys, Diederik M. Roijers, Ann Nowé

Our technique, Directed Policy Gradient (DPG), allows a teacher or backup policy to override the agent before it acts undesirably, while allowing the agent to leverage human advice or directives to learn faster.

reinforcement-learning · Reinforcement Learning (RL) · +1

Ordered Preference Elicitation Strategies for Supporting Multi-Objective Decision Making

1 code implementation • 21 Feb 2018 • Luisa M. Zintgraf, Diederik M. Roijers, Sjoerd Linders, Catholijn M. Jonker, Ann Nowé

We build on previous work on Gaussian processes and pairwise comparisons for preference modelling, extend it to the multi-objective decision support scenario, and propose new ordered preference elicitation strategies based on ranking and clustering.

Clustering · Decision Making · +1

Bayesian Best-Arm Identification for Selecting Influenza Mitigation Strategies

no code implementations • 16 Nov 2017 • Pieter Libin, Timothy Verstraeten, Diederik M. Roijers, Jelena Grujic, Kristof Theys, Philippe Lemey, Ann Nowé

We evaluate these algorithms in a realistic experimental setting and demonstrate that it is possible to identify the optimal strategy using only a limited number of model evaluations, i.e., 2 to 3 times faster than the uniform sampling method, the predominant technique used for epidemiological decision making in the literature.

Decision Making · Thompson Sampling

Multi-Objective Deep Reinforcement Learning

2 code implementations • 9 Oct 2016 • Hossam Mossalam, Yannis M. Assael, Diederik M. Roijers, Shimon Whiteson

We propose Deep Optimistic Linear Support Learning (DOL) to solve high-dimensional multi-objective decision problems where the relative importances of the objectives are not known a priori.

Multi-Objective Reinforcement Learning · reinforcement-learning

Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)

no code implementations • 29 Nov 2015 • Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt

In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.

Decision Making · Decision Making Under Uncertainty
