Search Results for author: Frans A. Oliehoek

Found 44 papers, 11 papers with code

Policy Space Response Oracles: A Survey

no code implementations · 4 Mar 2024 · Ariyan Bighashdel, Yongzhao Wang, Stephen Mcaleer, Rahul Savani, Frans A. Oliehoek

In game theory, a game is a model of interaction among rational decision-makers, or players, who make choices with the goal of achieving their individual objectives.
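For reference, such a game is commonly formalized in standard normal-form notation (textbook convention, not notation taken from this survey):

```latex
G = \langle N, \{A_i\}_{i \in N}, \{u_i\}_{i \in N} \rangle
```

where $N$ is the set of players, $A_i$ is the action set of player $i$, and $u_i : \prod_{j \in N} A_j \to \mathbb{R}$ is the utility function that player $i$ seeks to maximize.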

Position

What Lies beyond the Pareto Front? A Survey on Decision-Support Methods for Multi-Objective Optimization

no code implementations · 19 Nov 2023 · Zuzanna Osika, Jazmin Zatarain Salazar, Diederik M. Roijers, Frans A. Oliehoek, Pradeep K. Murukannaiah

We present a review that unifies decision-support methods for exploring the solutions produced by multi-objective optimization (MOO) algorithms.

Ethics

Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL

no code implementations · 4 Jun 2023 · Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek

In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.

What model does MuZero learn?

no code implementations · 1 Jun 2023 · Jinke He, Thomas M. Moerland, Frans A. Oliehoek

Model-based reinforcement learning has drawn considerable interest in recent years, given its promise to improve sample efficiency.

Model-based Reinforcement Learning · reinforcement-learning

Towards a Unifying Model of Rationality in Multiagent Systems

no code implementations · 29 May 2023 · Robert Loftin, Mustafa Mert Çelikok, Frans A. Oliehoek

Multiagent systems deployed in the real world need to cooperate with other agents (including humans) nearly as effectively as these agents cooperate with one another.

Safe Multi-agent Learning via Trapping Regions

no code implementations · 27 Feb 2023 · Aleksander Czechowski, Frans A. Oliehoek

One of the main challenges of multi-agent learning lies in establishing convergence of the algorithms: in general, a collection of individual, self-serving agents learning concurrently is not guaranteed to converge to a stable joint policy.

Generative Adversarial Network

Uncoupled Learning of Differential Stackelberg Equilibria with Commitments

no code implementations · 7 Feb 2023 · Robert Loftin, Mustafa Mert Çelikok, Herke van Hoof, Samuel Kaski, Frans A. Oliehoek

A natural solution concept for many multiagent settings is the Stackelberg equilibrium, under which a "leader" agent selects a strategy that maximizes its own payoff assuming the "follower" chooses their best response to this strategy.
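This leader-follower structure can be written as the standard bilevel program (generic textbook formulation, not notation from the paper):

```latex
\sigma_L^{*} \in \arg\max_{\sigma_L} \; u_L\big(\sigma_L, \mathrm{BR}(\sigma_L)\big),
\qquad
\mathrm{BR}(\sigma_L) \in \arg\max_{\sigma_F} \; u_F(\sigma_L, \sigma_F)
```

where $u_L$ and $u_F$ are the leader's and follower's payoffs and $\mathrm{BR}$ denotes the follower's best-response map.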

Multi-agent Reinforcement Learning

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

1 code implementation · 1 Jul 2022 · Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek

Due to the high sample complexity of reinforcement learning, simulation is, as of today, critical for its successful application.

On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games

no code implementations · 20 Jun 2022 · Robert Loftin, Frans A. Oliehoek

Learning to cooperate with other agents is challenging when those agents also possess the ability to adapt to our own behavior.

BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs

no code implementations · 17 Feb 2022 · Sammie Katt, Hai Nguyen, Frans A. Oliehoek, Christopher Amato

Under this parameterization, in contrast to previous work, the belief over the state and dynamics is a more scalable inference problem.

Reinforcement Learning (RL)

Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems

no code implementations · 3 Feb 2022 · Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).

Reinforcement Learning (RL)

Online Planning in POMDPs with Self-Improving Simulators

1 code implementation · 27 Jan 2022 · Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek

To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation, based on a statistic that measures the accuracy of the approximate simulator.
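The adaptive selection idea can be sketched roughly as follows; this is a minimal illustrative sketch, not the paper's method — the class, method names, and the simple running-accuracy statistic are all hypothetical:

```python
class AdaptiveSimulatorSelector:
    """Hypothetical sketch: use the cheap approximate simulator only once a
    running accuracy statistic says it is trustworthy; otherwise fall back
    to the exact (slow) simulator. Names and thresholds are illustrative."""

    def __init__(self, threshold=0.9, min_samples=10):
        self.threshold = threshold      # required accuracy before trusting
        self.min_samples = min_samples  # minimum comparisons before trusting
        self.correct = 0
        self.total = 0

    def record(self, approx_outcome, exact_outcome):
        # Update the statistic whenever both simulators were queried
        # on the same input.
        self.total += 1
        self.correct += int(approx_outcome == exact_outcome)

    def accuracy(self):
        return self.correct / self.total if self.total else 0.0

    def use_approximate(self):
        # Be conservative: require enough evidence and high agreement.
        return self.total >= self.min_samples and self.accuracy() >= self.threshold
```

In use, each simulation step would first consult `use_approximate()` to decide which simulator to roll out, occasionally querying both to keep the statistic fresh.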

MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced Active Learning

1 code implementation · 30 Dec 2021 · Markus Peschl, Arkady Zgonnikov, Frans A. Oliehoek, Luciano C. Siebert

Inferring reward functions from demonstrations and pairwise preferences is a promising approach for aligning Reinforcement Learning (RL) agents with human intentions.

Active Learning · Ethics +1

Multi-Agent MDP Homomorphic Networks

1 code implementation · ICLR 2022 · Elise van der Pol, Herke van Hoof, Frans A. Oliehoek, Max Welling

This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information, yet is able to share experience between global symmetries in the joint state-action space of cooperative multi-agent systems.

Difference Rewards Policy Gradients

no code implementations · 21 Dec 2020 · Jacopo Castellini, Sam Devlin, Frans A. Oliehoek, Rahul Savani

Policy gradient methods have become one of the most popular classes of algorithms for multi-agent reinforcement learning.

counterfactual · Multi-agent Reinforcement Learning +2
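For context, the classical difference reward that such policy-gradient methods build on (Wolpert and Tumer's formulation, not the paper's specific estimator) replaces agent $i$'s action with a default action $c_i$ and compares global values:

```latex
D_i(z) = G(z) - G(z_{-i} \cup c_i)
```

so each agent is credited only for its own marginal contribution to the shared objective, easing multi-agent credit assignment.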

Analog Circuit Design with Dyna-Style Reinforcement Learning

no code implementations · 16 Nov 2020 · Wook Lee, Frans A. Oliehoek

One of the aspects that makes this problem challenging to optimize is that measuring the performance of candidate configurations with simulation can be computationally expensive, particularly in the post-layout design.

Layout Design · Model-based Reinforcement Learning +2

Loss Bounds for Approximate Influence-Based Abstraction

1 code implementation · 3 Nov 2020 · Elena Congeduti, Alexander Mey, Frans A. Oliehoek

Sequential decision making techniques hold great promise to improve the performance of many real-world systems, but computational complexity hampers their principled application.

Decision Making

Multi-agent active perception with prediction rewards

1 code implementation · NeurIPS 2020 · Mikko Lauri, Frans A. Oliehoek

The accuracy is quantified by a centralized prediction reward determined by a centralized decision-maker who perceives the observations gathered by all agents after the task ends.

Influence-Augmented Online Planning for Complex Environments

1 code implementation · NeurIPS 2020 · Jinke He, Miguel Suau, Frans A. Oliehoek

In this work, we propose influence-augmented online planning, a principled method to transform a factored simulator of the entire environment into a local simulator that samples only the state variables that are most relevant to the observation and reward of the planning agent and captures the incoming influence from the rest of the environment using machine learning methods.

Exploiting Submodular Value Functions For Scaling Up Active Perception

no code implementations · 21 Sep 2020 · Yash Satsangi, Shimon Whiteson, Frans A. Oliehoek, Matthijs T. J. Spaan

Furthermore, we show that, under certain conditions, including submodularity, the value function computed using greedy PBVI is guaranteed to have bounded error with respect to the optimal value function.
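The classical result underlying guarantees of this kind is the greedy bound for maximizing a monotone submodular set function $f$ under a cardinality constraint (the Nemhauser–Wolsey–Fisher result; a general fact, not the paper's specific bound):

```latex
f(S_{\mathrm{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right) f(S^{*})
```

where $S_{\mathrm{greedy}}$ is the greedily selected set and $S^{*}$ the optimal set of the same size.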

Sensor Data for Human Activity Recognition: Feature Representation and Benchmarking

no code implementations · 15 May 2020 · Flávia Alves, Martin Gairing, Frans A. Oliehoek, Thanh-Toan Do

In HAR, the development of Activity Recognition models is dependent upon the data captured by these devices and the methods used to analyse them, which directly affect performance metrics.

Benchmarking · Human Activity Recognition

Mimicking Evolution with Reinforcement Learning

no code implementations · NeurIPS 2021 · João P. Abrantes, Arnaldo J. Abrantes, Frans A. Oliehoek

This work proposes Evolution via Evolutionary Reward (EvER), which allows learning to single-handedly drive the search for policies with increasing evolutionary fitness by ensuring the alignment of the reward function with the fitness function.

Evolutionary Algorithms · reinforcement-learning +1

Decentralized MCTS via Learned Teammate Models

no code implementations · 19 Mar 2020 · Aleksander Czechowski, Frans A. Oliehoek

Decentralized online planning can be an attractive paradigm for cooperative multi-agent systems, due to improved scalability and robustness.

Plannable Approximations to MDP Homomorphisms: Equivariance under Actions

1 code implementation · 27 Feb 2020 · Elise van der Pol, Thomas Kipf, Frans A. Oliehoek, Max Welling

We introduce a contrastive loss function that enforces action equivariance on the learned representations.

Representation Learning
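The snippet above can be made concrete with a small sketch of a contrastive action-equivariance loss. This is an illustrative approximation under assumed conventions (squared-distance terms, a hinge on a single negative sample), not the paper's exact loss:

```python
import numpy as np

def contrastive_equivariance_loss(z, z_next, z_neg, transition, margin=1.0):
    """Illustrative sketch: pull the transition model's prediction toward the
    encoding of the true next state (equivariance term), and push it away
    from a negative sample by at least `margin` (contrastive term)."""
    pred = transition(z)                                 # predicted next latent
    positive = np.sum((pred - z_next) ** 2)              # attract true successor
    negative = max(0.0, margin - np.sum((pred - z_neg) ** 2))  # repel negative
    return positive + negative
```

With a perfect transition model and a negative sample farther away than the margin, both terms vanish and the loss is zero.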

A Sufficient Statistic for Influence in Structured Multiagent Environments

no code implementations · 22 Jul 2019 · Frans A. Oliehoek, Stefan Witwicki, Leslie P. Kaelbling

In these ways, this paper deepens our understanding of abstraction in a wide range of sequential decision making settings, providing the basis for new approaches and algorithms for a large class of problems.

Decision Making

Learning from Demonstration in the Wild

no code implementations · 8 Nov 2018 · Feryal Behbahani, Kyriacos Shiarlis, Xi Chen, Vitaly Kurin, Sudhanshu Kasewa, Ciprian Stirbu, João Gomes, Supratik Paul, Frans A. Oliehoek, João Messias, Shimon Whiteson

Learning from demonstration (LfD) is useful in settings where hand-coding behaviour or a reward function is impractical.

Beyond Local Nash Equilibria for Adversarial Networks

no code implementations · 18 Jun 2018 · Frans A. Oliehoek, Rahul Savani, Jose Gallego, Elise van der Pol, Roderich Groß

Save for some special cases, current training methods for Generative Adversarial Networks (GANs) are at best guaranteed to converge to a "local Nash equilibrium" (LNE).

Learning in POMDPs with Monte Carlo Tree Search

no code implementations · ICML 2017 · Sammie Katt, Frans A. Oliehoek, Christopher Amato

The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult.

GANGs: Generative Adversarial Network Games

no code implementations · 2 Dec 2017 · Frans A. Oliehoek, Rahul Savani, Jose Gallego-Posada, Elise van der Pol, Edwin D. de Jong, Roderich Gross

We introduce Generative Adversarial Network Games (GANGs), which explicitly model a finite zero-sum game between a generator ($G$) and classifier ($C$) that use mixed strategies.

Generative Adversarial Network
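A finite zero-sum game in mixed strategies has the familiar minimax form (standard game-theoretic formulation, not the paper's notation):

```latex
\max_{\mu_G} \; \min_{\mu_C} \;
\mathbb{E}_{g \sim \mu_G,\; c \sim \mu_C}\big[ u(g, c) \big]
```

where $\mu_G$ and $\mu_C$ are probability distributions over the generator's and classifier's pure strategies and $u$ is the generator's payoff.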

Scaling POMDPs For Selecting Sellers in E-markets - Extended Version

no code implementations · 30 Nov 2015 · Athirai A. Irissappane, Frans A. Oliehoek, Jie Zhang

In multiagent e-marketplaces, buying agents need to select good sellers by querying other buyers (called advisors).

Solving Transition-Independent Multi-agent MDPs with Sparse Interactions (Extended version)

no code implementations · 29 Nov 2015 · Joris Scharpff, Diederik M. Roijers, Frans A. Oliehoek, Matthijs T. J. Spaan, Mathijs M. de Weerdt

In cooperative multi-agent sequential decision making under uncertainty, agents must coordinate to find an optimal joint policy that maximises joint value.

Decision Making · Decision Making Under Uncertainty

Exploiting Anonymity in Approximate Linear Programming: Scaling to Large Multiagent MDPs (Extended Version)

no code implementations · 29 Nov 2015 · Philipp Robbel, Frans A. Oliehoek, Mykel J. Kochenderfer

We present an approach to mitigate this limitation for certain types of multiagent systems, exploiting a property that can be thought of as "anonymous influence" in the factored MDP.

Influence-Optimistic Local Values for Multiagent Planning --- Extended Version

no code implementations · 18 Feb 2015 · Frans A. Oliehoek, Matthijs T. J. Spaan, Stefan Witwicki

Recent years have seen the development of methods for multiagent planning under uncertainty that scale to tens or even hundreds of agents.

Benchmarking

Scalable Planning and Learning for Multiagent POMDPs: Extended Version

1 code implementation · 4 Apr 2014 · Christopher Amato, Frans A. Oliehoek

Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces.

reinforcement-learning · Reinforcement Learning (RL)

Exploiting Agent and Type Independence in Collaborative Graphical Bayesian Games

no code implementations · 1 Aug 2011 · Frans A. Oliehoek, Shimon Whiteson, Matthijs T. J. Spaan

Such problems can be modeled as collaborative Bayesian games in which each agent receives private information in the form of its type.

Decision Making · Vocal Bursts Type Prediction
