no code implementations • 1 Nov 2024 • Charlie B. Tan, Edan Toledo, Benjamin Ellis, Jakob N. Foerster, Ferenc Huszár

Using this insight, we propose outer proximal policy optimization (outer-PPO), a framework in which these update vectors are applied using an arbitrary gradient-based optimizer.
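The core idea can be sketched as treating the inner PPO phase's update vector as a pseudo-gradient for an arbitrary outer optimizer. This is a minimal illustration under that assumption, not the paper's implementation; all names and constants here are made up:

```python
import numpy as np

def outer_ppo_step(theta, theta_inner, velocity, lr=1.0, momentum=0.0):
    # Update vector produced by an inner PPO optimization phase.
    delta = theta_inner - theta
    # Treat -delta as a pseudo-gradient and hand it to an arbitrary outer
    # optimizer -- here SGD with momentum, purely for illustration.
    velocity = momentum * velocity + (-delta)
    return theta - lr * velocity, velocity

theta = np.zeros(3)
theta_inner = np.array([0.1, -0.2, 0.3])
theta, v = outer_ppo_step(theta, theta_inner, np.zeros(3))
# With lr=1 and momentum=0 this recovers the vanilla PPO update.
```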

1 code implementation • 16 Sep 2024 • Sebastian Towers, Aleksandra Kalisz, Philippe A. Robert, Alicia Higueruelo, Francesca Vianello, Ming-Han Chloe Tsai, Harrison Steel, Jakob N. Foerster

Within our simulations, we demonstrate that our shapers target both current and simulated future viral variants, outperforming antibodies chosen myopically.

1 code implementation • 18 Jun 2024 • Jarek Liesen, Chris Lu, Andrei Lupu, Jakob N. Foerster, Henning Sprekeler, Robert T. Lange

The initial approach was limited to toy problems and produced environments that did not transfer to unseen RL algorithms.

no code implementations • 5 Jun 2024 • Tim Franzmeyer, Aleksandar Shtedritski, Samuel Albanie, Philip Torr, João F. Henriques, Jakob N. Foerster

Verifying whether an X note is helpful, or whether a Wikipedia edit should be accepted, is a hard task that requires grounding by querying the web.

1 code implementation • NeurIPS 2023 • Benjamin Ellis, Jonathan Cook, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson

In this work, we conduct new analysis demonstrating that SMAC lacks the stochasticity and partial observability to require complex *closed-loop* policies.

no code implementations • 28 Oct 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Gerald Tesauro, Jonathan P. How

By directly comparing active equilibria to Nash equilibria in these examples, we find that active equilibria yield more effective solutions, and conclude that an active equilibrium is the desired solution concept for multiagent learning settings.

no code implementations • 11 Oct 2022 • Isaac Liao, Rumen R. Dangovski, Jakob N. Foerster, Marin Soljačić

This paper introduces a novel machine learning optimizer called LODO, which meta-learns the best preconditioner online during optimization.
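To illustrate what meta-learning a preconditioner online can look like, here is a hypergradient-style sketch on a toy quadratic — a simple stand-in for the idea, not LODO itself; the loss, constants, and update rule are all illustrative assumptions:

```python
import numpy as np

# Toy ill-conditioned quadratic: f(x) = 0.5 * x^T diag([100, 1]) x.
H = np.array([100.0, 1.0])

x = np.array([1.0, 1.0])
P = np.ones(2)              # diagonal preconditioner, meta-learned online
lr, meta_lr = 1e-2, 1e-4
g_prev = np.zeros(2)
for _ in range(500):
    g = H * x
    # Meta-update: grow P[i] where successive gradients agree in sign,
    # shrink it where they oscillate (hypergradient-descent style).
    P = np.maximum(P + meta_lr * g * g_prev, 1e-6)
    # Preconditioned descent step using the current learned P.
    x = x - lr * P * g
    g_prev = g
```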

no code implementations • 20 Jul 2022 • Tim Franzmeyer, Stephen Mcaleer, João F. Henriques, Jakob N. Foerster, Philip H. S. Torr, Adel Bibi, Christian Schroeder de Witt

Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs.

no code implementations • NeurIPS 2021 • Brandon Cui, Hengyuan Hu, Luis Pineda, Jakob N. Foerster

The standard problem setting in cooperative multi-agent learning is self-play (SP), where the goal is to train a team of agents that works well together.

no code implementations • 13 Jul 2022 • Hengyuan Hu, Samuel Sokota, David Wu, Anton Bakhtin, Andrei Lupu, Brandon Cui, Jakob N. Foerster

Fully cooperative, partially observable multi-agent problems are ubiquitous in the real world.

1 code implementation • 7 Mar 2022 • Dong-Ki Kim, Matthew Riemer, Miao Liu, Jakob N. Foerster, Michael Everett, Chuangchuang Sun, Gerald Tesauro, Jonathan P. How

An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and influence the evolution of future policies towards desirable behavior for its own benefit.

1 code implementation • 11 Feb 2020 • Dmitrii Beloborodov, A. E. Ulanov, Jakob N. Foerster, Shimon Whiteson, A. I. Lvovsky

Quantum hardware and quantum-inspired algorithms are becoming increasingly popular for combinatorial optimization.

4 code implementations • ICLR 2020 • Hengyuan Hu, Jakob N. Foerster

Learning to be informative when observed by others is an interesting challenge for reinforcement learning (RL): fundamentally, RL requires agents to explore in order to discover good policies.

2 code implementations • 23 Oct 2019 • Reda Bahi Slaoui, William R. Clements, Jakob N. Foerster, Sébastien Toth

Producing agents that can generalize to a wide range of visually different environments is a significant challenge in reinforcement learning.

no code implementations • 25 Sep 2019 • Reda Bahi Slaoui, William R. Clements, Jakob N. Foerster, Sébastien Toth

In this work, we formalize the domain randomization problem, and show that minimizing the policy's Lipschitz constant with respect to the randomization parameters leads to low variance in the learned policies.
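One simple way to realize this idea is to add a penalty estimating the policy's sensitivity to the randomization parameters. The sketch below uses finite differences on a toy policy; the policy form and penalty shape are assumptions for illustration, not the paper's construction:

```python
import numpy as np

def policy(obs, xi, w):
    # Toy policy whose input is perturbed by randomization parameters xi.
    return np.tanh(w @ (obs + xi))

def lipschitz_penalty(obs, xi, w, eps=1e-3):
    # Finite-difference estimate of the policy's sensitivity to xi.
    # Adding this penalty to the RL objective pushes down the policy's
    # Lipschitz constant w.r.t. the randomization parameters.
    xi_pert = xi + eps * np.random.randn(*xi.shape)
    diff = policy(obs, xi_pert, w) - policy(obs, xi, w)
    return np.linalg.norm(diff) / np.linalg.norm(xi_pert - xi)
```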

2 code implementations • 9 Sep 2019 • Thomas D. Barrett, William R. Clements, Jakob N. Foerster, A. I. Lvovsky

Our approach of exploratory combinatorial optimization (ECO-DQN) is, in principle, applicable to any combinatorial problem that can be defined on a graph.
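To make the graph formulation concrete, here is a minimal max-cut sketch with an exploratory, improvement-over-best reward in the spirit of ECO-DQN; the reward shape and helper names are illustrative assumptions, not the paper's code:

```python
import numpy as np

def cut_value(adj, s):
    # Max-cut objective: total weight of edges crossing the partition s.
    n = len(s)
    return sum(adj[i, j] for i in range(n) for j in range(i + 1, n)
               if s[i] != s[j])

def flip_step(adj, s, v, best_so_far):
    # Exploratory reward: flipping vertex v is rewarded only for improving
    # on the best cut seen so far, so the agent may pass through worse
    # states while it keeps exploring at test time.
    s_next = s.copy()
    s_next[v] ^= 1
    reward = max(cut_value(adj, s_next) - best_so_far, 0.0)
    return reward, s_next

adj = np.ones((3, 3)) - np.eye(3)      # unweighted triangle graph
s = np.zeros(3, dtype=int)
reward, s = flip_step(adj, s, 0, best_so_far=cut_value(adj, s))
```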

1 code implementation • 1 Feb 2019 • Nolan Bard, Jakob N. Foerster, Sarath Chandar, Neil Burch, Marc Lanctot, H. Francis Song, Emilio Parisotto, Vincent Dumoulin, Subhodeep Moitra, Edward Hughes, Iain Dunning, Shibl Mourad, Hugo Larochelle, Marc G. Bellemare, Michael Bowling

From the early days of computing, games have been important testbeds for studying how well machines can do sophisticated decision making.

1 code implementation • 4 Nov 2018 • Jakob N. Foerster, Francis Song, Edward Hughes, Neil Burch, Iain Dunning, Shimon Whiteson, Matthew Botvinick, Michael Bowling

We present the Bayesian action decoder (BAD), a new multi-agent learning method that uses an approximate Bayesian update to obtain a public belief that conditions on the actions taken by all agents in the environment.
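The belief update at the heart of this idea can be sketched as ordinary Bayesian conditioning of a public belief on an observed action — a toy illustration of the mechanism, not the full BAD agent; the two-state example is made up:

```python
import numpy as np

def public_belief_update(belief, action, policy):
    # Condition the public belief over private states on the action just
    # taken: posterior ∝ prior × P(action | state).
    posterior = belief * policy[:, action]
    return posterior / posterior.sum()

# Two equally likely private states; the policy acts differently in each.
policy = np.array([[0.9, 0.1],
                   [0.2, 0.8]])        # rows: states, cols: actions
posterior = public_belief_update(np.array([0.5, 0.5]), 0, policy)
```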

1 code implementation • NeurIPS 2019 • Christian A. Schroeder de Witt, Jakob N. Foerster, Gregory Farquhar, Philip H. S. Torr, Wendelin Boehmer, Shimon Whiteson

In this paper, we show that common knowledge between agents allows for complex decentralised coordination.


6 code implementations • 13 Sep 2017 • Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

We also show that the LOLA update rule can be efficiently calculated using an extension of the policy gradient estimator, making the method suitable for model-free RL.
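The shaping step can be sketched on a toy differentiable game: agent 1 anticipates its opponent's naive gradient step and differentiates through it. Finite differences below stand in for the policy-gradient estimator; the game and all constants are illustrative assumptions:

```python
def V1(a, b):  # agent 1's payoff in a toy bilinear game
    return a * b

def V2(a, b):  # agent 2's payoff (zero-sum here, purely for illustration)
    return -a * b

def lola_step(a, b, lr=0.1, eta=0.1, eps=1e-5):
    # Agent 1 anticipates agent 2's naive gradient step on V2 and
    # differentiates through it.
    def shaped(a_):
        grad_b = (V2(a_, b + eps) - V2(a_, b - eps)) / (2 * eps)
        return V1(a_, b + eta * grad_b)   # value after the opponent learns
    grad_a = (shaped(a + eps) - shaped(a - eps)) / (2 * eps)
    return a + lr * grad_a

a_new = lola_step(1.0, 0.0)
# A naive learner would not move here (its own gradient, b, is zero);
# the LOLA correction term does move.
```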

no code implementations • ICML 2017 • Jakob N. Foerster, Justin Gilmer, Jan Chorowski, Jascha Sohl-Dickstein, David Sussillo

There exist many problem domains where the interpretability of neural network models is essential for deployment.

3 code implementations • NeurIPS 2016 • Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson

We consider the problem of multiple agents sensing and acting in environments with the goal of maximising their shared utility.

no code implementations • 8 Feb 2016 • Jakob N. Foerster, Yannis M. Assael, Nando de Freitas, Shimon Whiteson

We propose deep distributed recurrent Q-networks (DDRQN), which enable teams of agents to learn to solve communication-based coordination tasks.
