Search Results for author: Chris Lu

Found 20 papers, 14 papers with code

Revisiting Recurrent Reinforcement Learning with Memory Monoids

1 code implementation • 15 Feb 2024 • Steven Morad, Chris Lu, Ryan Kortvelesy, Stephan Liwicki, Jakob Foerster, Amanda Prorok

Memory models such as Recurrent Neural Networks (RNNs) and Transformers address Partially Observable Markov Decision Processes (POMDPs) by mapping trajectories to latent Markov states.

reinforcement-learning

Discovering Temporally-Aware Reinforcement Learning Algorithms

1 code implementation • 8 Feb 2024 • Matthew Thomas Jackson, Chris Lu, Louis Kirsch, Robert Tjarko Lange, Shimon Whiteson, Jakob Nicolaus Foerster

We propose a simple augmentation to two existing objective discovery approaches that allows the discovered algorithm to dynamically update its objective function throughout the agent's training procedure, resulting in expressive schedules and increased generalization across different training horizons.

Meta-Learning reinforcement-learning

Analysing the Sample Complexity of Opponent Shaping

no code implementations • 8 Feb 2024 • Kitty Fung, Qizhen Zhang, Chris Lu, Jia Wan, Timon Willi, Jakob Foerster

Providing theoretical guarantees for M-FOS is hard because A) there is little literature on theoretical sample complexity bounds for meta-reinforcement learning B) M-FOS operates in continuous state and action spaces, so theoretical analysis is challenging.

Meta Reinforcement Learning

Meta-learning the mirror map in policy mirror descent

no code implementations • 7 Feb 2024 • Carlo Alfano, Sebastian Towers, Silvia Sapora, Chris Lu, Patrick Rebeschini

Policy Mirror Descent (PMD) is a popular framework in reinforcement learning, serving as a unifying perspective that encompasses numerous algorithms.

Meta-Learning

Leading the Pack: N-player Opponent Shaping

no code implementations • 19 Dec 2023 • Alexandra Souly, Timon Willi, Akbir Khan, Robert Kirk, Chris Lu, Edward Grefenstette, Tim Rocktäschel

We evaluate on over 4 different environments, varying the number of players from 3 to 5, and demonstrate that model-based OS methods converge to equilibrium with better global welfare than naive learning.

Scaling Opponent Shaping to High Dimensional Games

no code implementations • 19 Dec 2023 • Akbir Khan, Timon Willi, Newton Kwan, Andrea Tacchetti, Chris Lu, Edward Grefenstette, Tim Rocktäschel, Jakob Foerster

In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes.

Meta-Learning

JaxMARL: Multi-Agent RL Environments in JAX

2 code implementations • 16 Nov 2023 • Alexander Rutherford, Benjamin Ellis, Matteo Gallici, Jonathan Cook, Andrei Lupu, Gardar Ingvarsson, Timon Willi, Akbir Khan, Christian Schroeder de Witt, Alexandra Souly, Saptarashmi Bandyopadhyay, Mikayel Samvelyan, Minqi Jiang, Robert Tjarko Lange, Shimon Whiteson, Bruno Lacerda, Nick Hawes, Tim Rocktäschel, Chris Lu, Jakob Nicolaus Foerster

This not only enables GPU acceleration, but also provides a more flexible MARL environment, unlocking the potential for self-play, meta-learning, and other future applications in MARL.

Meta-Learning Multi-agent Reinforcement Learning +3

Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design

1 code implementation • NeurIPS 2023 • Matthew Thomas Jackson, Minqi Jiang, Jack Parker-Holder, Risto Vuorio, Chris Lu, Gregory Farquhar, Shimon Whiteson, Jakob Nicolaus Foerster

Recently, it has been shown that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks.

General Reinforcement Learning reinforcement-learning +1

JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading

no code implementations • 25 Aug 2023 • Sascha Frey, Kang Li, Peer Nagy, Silvia Sapora, Chris Lu, Stefan Zohren, Jakob Foerster, Anisoara Calinescu

Financial exchanges across the world use limit order books (LOBs) to process orders and match trades.

reinforcement-learning Reinforcement Learning (RL)

ReLU to the Rescue: Improve Your On-Policy Actor-Critic with Positive Advantages

1 code implementation • 2 Jun 2023 • Andrew Jesson, Chris Lu, Gunshi Gupta, Angelos Filos, Jakob Nicolaus Foerster, Yarin Gal

We show that the additive term is bounded proportional to the Lipschitz constant of the value function, which offers theoretical grounding for spectral normalization of critic weights.

Bayesian Inference Continuous Control +3

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

1 code implementation • 8 Apr 2023 • Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag

Genetic algorithms constitute a family of black-box optimization algorithms, which take inspiration from the principles of biological evolution.

Arbitrary Order Meta-Learning with Simple Population-Based Evolution

no code implementations • 16 Mar 2023 • Chris Lu, Sebastian Towers, Jakob Foerster

Meta-learning, the notion of learning to learn, enables learning systems to quickly and flexibly solve new tasks.

Meta-Learning Time Series +1

Structured State Space Models for In-Context Reinforcement Learning

2 code implementations • NeurIPS 2023 • Chris Lu, Yannick Schroecker, Albert Gu, Emilio Parisotto, Jakob Foerster, Satinder Singh, Feryal Behbahani

We propose a modification to a variant of S4 that enables us to initialise and reset the hidden state in parallel, allowing us to tackle reinforcement learning tasks.

Continuous Control Meta-Learning +1

Discovering Evolution Strategies via Meta-Black-Box Optimization

1 code implementation • 21 Nov 2022 • Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh, Sebastian Flennerhag

Optimizing functions without access to gradients is the remit of black-box methods such as evolution strategies.

Continuous Control Meta-Learning

Adversarial Cheap Talk

1 code implementation • 20 Nov 2022 • Chris Lu, Timon Willi, Alistair Letcher, Jakob Foerster

More specifically, we show that an ACT Adversary is capable of harming performance by interfering with the learner's function approximation, or instead helping the Victim's performance by outputting useful features.

Meta-Learning Reinforcement Learning (RL)

Proximal Learning With Opponent-Learning Awareness

1 code implementation • 18 Oct 2022 • Stephen Zhao, Chris Lu, Roger Baker Grosse, Jakob Nicolaus Foerster

This problem is especially pronounced in the opponent modeling setting, where the opponent's policy is unknown and must be inferred from observations; in such settings, LOLA is ill-specified because behaviorally equivalent opponent policies can result in non-equivalent updates.

Multi-agent Reinforcement Learning

Discovered Policy Optimisation

1 code implementation • 11 Oct 2022 • Chris Lu, Jakub Grudzien Kuba, Alistair Letcher, Luke Metz, Christian Schroeder de Witt, Jakob Foerster

We refer to the immediate result as Learnt Policy Optimisation (LPO).

Meta-Learning Reinforcement Learning (RL)

Model-Free Opponent Shaping

2 code implementations • 3 May 2022 • Chris Lu, Timon Willi, Christian Schroeder de Witt, Jakob Foerster

In general-sum games, the interaction of self-interested learning agents commonly leads to collectively worst-case outcomes, such as defect-defect in the iterated prisoner's dilemma (IPD).

Centralized Model and Exploration Policy for Multi-Agent RL

1 code implementation • 14 Jul 2021 • Qizhen Zhang, Chris Lu, Animesh Garg, Jakob Foerster

We also learn a centralized exploration policy within our model that learns to collect additional data in state-action regions with high model uncertainty.

Reinforcement Learning (RL)

Learning to Control Self-Assembling Morphologies: A Study of Generalization via Modularity

1 code implementation • NeurIPS 2019 • Deepak Pathak, Chris Lu, Trevor Darrell, Phillip Isola, Alexei A. Efros

We evaluate the performance of these dynamic and modular agents in simulated environments.
