Search Results for author: Miguel Suau

Found 7 papers, 4 papers with code

Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL

no code implementations • 4 Jun 2023 • Miguel Suau, Matthijs T. J. Spaan, Frans A. Oliehoek

In this paper, we provide a mathematical characterization of this phenomenon, which we refer to as policy confounding, and show, through a series of examples, when and how it occurs in practice.

Distributed Influence-Augmented Local Simulators for Parallel MARL in Large Networked Systems

1 code implementation • 1 Jul 2022 • Miguel Suau, Jinke He, Mustafa Mert Çelikok, Matthijs T. J. Spaan, Frans A. Oliehoek

Due to its high sample complexity, simulation is, as of today, critical for the successful application of reinforcement learning.

Influence-Augmented Local Simulators: A Scalable Solution for Fast Deep RL in Large Networked Systems

no code implementations • 3 Feb 2022 • Miguel Suau, Jinke He, Matthijs T. J. Spaan, Frans A. Oliehoek

Learning effective policies for real-world problems is still an open challenge for the field of reinforcement learning (RL).

Reinforcement Learning (RL)

Online Planning in POMDPs with Self-Improving Simulators

1 code implementation • 27 Jan 2022 • Jinke He, Miguel Suau, Hendrik Baier, Michael Kaisers, Frans A. Oliehoek

To plan reliably and efficiently while the approximate simulator is learning, we develop a method that adaptively decides which simulator to use for every simulation, based on a statistic that measures the accuracy of the approximate simulator.

Offline Contextual Bandits for Wireless Network Optimization

no code implementations • 11 Nov 2021 • Miguel Suau, Alexandros Agapitos, David Lynch, Derek Farrell, Mingqi Zhou, Aleksandar Milenovic

The explosion in mobile data traffic together with the ever-increasing expectations for higher quality of service call for the development of AI algorithms for wireless network optimization.

Computational Efficiency, Multi-Armed Bandits

Influence-Augmented Online Planning for Complex Environments

1 code implementation • NeurIPS 2020 • Jinke He, Miguel Suau, Frans A. Oliehoek

In this work, we propose influence-augmented online planning, a principled method to transform a factored simulator of the entire environment into a local simulator that samples only the state variables that are most relevant to the observation and reward of the planning agent and captures the incoming influence from the rest of the environment using machine learning methods.
