16 code implementations • ICML 2018 • Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted.
Ranked #1 on SMAC+ on Off_Near_parallel
Multi-agent Reinforcement Learning reinforcement-learning +4
20 code implementations • 11 Feb 2019 • Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson
In this paper, we propose the StarCraft Multi-Agent Challenge (SMAC) as a benchmark problem to fill this gap.
Ranked #6 on SMAC on SMAC 6h_vs_8z
no code implementations • 17 May 2019 • Christian Schroeder de Witt, Thomas Hornigold
As global greenhouse gas emissions continue to rise, the use of stratospheric aerosol injection (SAI), a form of solar geoengineering, is increasingly considered in order to artificially mitigate climate change effects.
no code implementations • pproximateinference AABI Symposium 2019 • Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Saeid Naderiparizi, Adam Scibior, Andreas Munk, Frank Wood, Mehrdad Ghadiri, Philip Torr, Yee Whye Teh, Atilim Gunes Baydin, Tom Rainforth
We introduce two approaches for conducting efficient Bayesian inference in stochastic simulators containing nested stochastic sub-procedures, i. e., internal procedures for which the density cannot be calculated directly such as rejection sampling loops.
1 code implementation • 20 Oct 2019 • Saeid Naderiparizi, Adam Ścibior, Andreas Munk, Mehrdad Ghadiri, Atılım Güneş Baydin, Bradley Gram-Hansen, Christian Schroeder de Witt, Robert Zinkov, Philip H. S. Torr, Tom Rainforth, Yee Whye Teh, Frank Wood
Naive approaches to amortized inference in probabilistic programs with unbounded loops can produce estimators with infinite variance.
1 code implementation • 19 Mar 2020 • Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
At the same time, it is often possible to train the agents in a centralised fashion where global state information is available and communication constraints are lifted.
Ranked #6 on SMAC on SMAC 6h_vs_8z
2 code implementations • 14 May 2020 • Christian Schroeder de Witt, Bradley Gram-Hansen, Nantas Nardelli, Andrew Gambardella, Rob Zinkov, Puneet Dokania, N. Siddharth, Ana Belen Espinosa-Gonzalez, Ara Darzi, Philip Torr, Atılım Güneş Baydin
The COVID-19 pandemic has highlighted the importance of in-silico epidemiological modelling in predicting the dynamics of infectious diseases to inform health policy and decision makers about suitable prevention and containment strategies.
6 code implementations • 18 Nov 2020 • Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson
Most recently developed approaches to cooperative multi-agent reinforcement learning in the \emph{centralized training with decentralized execution} setting involve estimating a centralized, joint value function.
no code implementations • 17 Dec 2020 • Christian Schroeder de Witt, Catherine Tong, Valentina Zantedeschi, Daniele De Martini, Freddie Kalaitzis, Matthew Chantry, Duncan Watson-Parris, Piotr Bilinski
Extreme precipitation events, such as violent rainfall and hail storms, routinely ravage economies and livelihoods around the developing world.
1 code implementation • 17 Jul 2021 • Samuel Sokota, Christian Schroeder de Witt, Maximilian Igl, Luisa Zintgraf, Philip Torr, Martin Strohmeier, J. Zico Kolter, Shimon Whiteson, Jakob Foerster
We contribute a theoretically grounded approach to MCGs based on maximum entropy reinforcement learning and minimum entropy coupling that we call MEME.
no code implementations • 23 Nov 2021 • Christian Schroeder de Witt, Yongchao Huang, Philip H. S. Torr, Martin Strohmeier
We then argue that attacker-defender fixed points are themselves general-sum games with complex phase transitions, and introduce a temporally extended multi-agent reinforcement learning framework in which the resultant dynamics can be studied.
1 code implementation • 7 Jan 2022 • Jakub Grudzien Kuba, Christian Schroeder de Witt, Jakob Foerster
In contrast, in this paper we introduce a novel theoretical framework, named Mirror Learning, which provides theoretical guarantees to a large class of algorithms, including TRPO and PPO.
2 code implementations • 3 May 2022 • Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster
In general-sum games, the interaction of self-interested learning agents commonly leads to collectively worst-case outcomes, such as defect-defect in the iterated prisoner's dilemma (IPD).
no code implementations • 28 May 2022 • Christian Schroeder de Witt
The correctness of our GA implementation is demonstrated using a test-bed fitness function, and our JaTAM implementation is verified by classifying a well-known search space $S_{2, 8}$ based on two tile types.
no code implementations • 26 Jun 2022 • Darius Muglich, Luisa Zintgraf, Christian Schroeder de Witt, Shimon Whiteson, Jakob Foerster
Self-play is a common paradigm for constructing solutions in Markov games that can yield optimal policies in collaborative settings.
no code implementations • 20 Jul 2022 • Tim Franzmeyer, Stephen Mcaleer, João F. Henriques, Jakob N. Foerster, Philip H. S. Torr, Adel Bibi, Christian Schroeder de Witt
Autonomous agents deployed in the real world need to be robust against adversarial attacks on sensory inputs.
1 code implementation • 11 Oct 2022 • Chris Lu, Jakub Grudzien Kuba, Alistair Letcher, Luke Metz, Christian Schroeder de Witt, Jakob Foerster
We refer to the immediate result as Learnt Policy Optimisation (LPO).
1 code implementation • 21 Oct 2022 • Darius Muglich, Christian Schroeder de Witt, Elise van der Pol, Shimon Whiteson, Jakob Foerster
Successful coordination in Dec-POMDPs requires agents to adopt robust strategies and interpretable styles of play for their partner.
1 code implementation • 24 Oct 2022 • Christian Schroeder de Witt, Samuel Sokota, J. Zico Kolter, Jakob Foerster, Martin Strohmeier
Steganography is the practice of encoding secret information into innocuous content in such a manner that an adversarial third party would not realize that there is hidden meaning.
no code implementations • 20 Nov 2022 • Dylan Radovic, Lucas Kruitwagen, Christian Schroeder de Witt, Ben Caldecott, Shane Tomlinson, Mark Workman
The energy transition potentially poses an existential risk for major international oil companies (IOCs) if they fail to adapt to low-carbon business models.
no code implementations • 19 Mar 2023 • Yat Long Lo, Christian Schroeder de Witt, Samuel Sokota, Jakob Nicolaus Foerster, Shimon Whiteson
By enabling agents to communicate, recent cooperative multi-agent reinforcement learning (MARL) methods have demonstrated better task performance and more coordinated behavior.
Multi-agent Reinforcement Learning reinforcement-learning +1
no code implementations • 24 Aug 2023 • Mattie Fellows, Brandon Kaplowitz, Christian Schroeder de Witt, Shimon Whiteson
Empirical results demonstrate that BEN can learn true Bayes-optimal policies in tasks where existing model-free approaches fail.
2 code implementations • 16 Nov 2023 • Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktaschel, Chris Lu, Jakob Nicolaus Foerster
This not only enables GPU acceleration, but also provides a more flexible MARL environment, unlocking the potential for self-play, meta-learning, and other future applications in MARL.
no code implementations • 12 Feb 2024 • Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt
In this paper, we comprehensively formalise the problem of secret collusion in systems of generative AI agents by drawing on relevant concepts from both the AI and security literature.