We believe that synthetic training data could open the door to realizing the full potential of deep learning for replay-based RL algorithms trained from limited data.
Sample efficiency and exploration remain major challenges in online reinforcement learning (RL).
We then present CASCADE, a novel approach for self-supervised exploration in this new setting.
Leveraging the new highly parallelizable Brax physics engine, we show that these innovations lead to large performance gains, significantly outperforming the tuned baseline while learning entire configurations on the fly.
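As a rough illustration of the parallelism Brax exposes, the sketch below batches environment resets and steps with jax.vmap. The envs.create/reset/step interface and the "ant" task are assumptions for the demo (brax v1-style functional API), not details taken from the result above.

```python
# Minimal sketch of batched simulation with Brax; the API details here
# are assumptions based on the brax v1-style functional interface.
import jax
from brax import envs

env = envs.create(env_name="ant")  # "ant" is just an illustrative task

# Brax's reset/step are pure JAX functions, so they can be vmapped
# across thousands of environment instances and jit-compiled once.
num_envs = 2048
reset_fn = jax.jit(jax.vmap(env.reset))
step_fn = jax.jit(jax.vmap(env.step))

rngs = jax.random.split(jax.random.PRNGKey(0), num_envs)
state = reset_fn(rngs)                        # batched initial states

actions = jax.numpy.zeros((num_envs, env.action_size))
state = step_fn(state, actions)               # one parallel physics step
print(state.reward.shape)                     # (2048,)
```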
Off-policy reinforcement learning (RL) from pixel observations is notoriously unstable.
Using this suite of benchmarking tasks, we show that simple modifications to two popular vision-based online reinforcement learning algorithms, DreamerV2 and DrQ-v2, suffice to outperform existing offline RL methods and establish competitive baselines for continuous control in the visual domain.
Significant progress has been made recently in offline model-based reinforcement learning, an approach that leverages a learned dynamics model.
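To make "leverages a learned dynamics model" concrete, here is a minimal, hypothetical sketch: a linear model fit to a fixed transition dataset and then used to generate imagined rollouts. Real offline model-based methods use deep networks and uncertainty penalties; every name and number below is illustrative, not any specific paper's method.

```python
# Hypothetical illustration of offline model-based RL's core idea:
# fit a dynamics model to a fixed dataset, then roll it out without
# further environment interaction. A linear model stands in for the
# deep networks real methods use.
import numpy as np

rng = np.random.default_rng(0)

# Offline dataset of transitions (s, a, s'); synthetic here for the demo.
S = rng.normal(size=(1000, 4))                     # states
A = rng.normal(size=(1000, 2))                     # actions
S_next = S + 0.1 * A @ rng.normal(size=(2, 4))     # unknown true dynamics

# Fit s' ~ W [s; a] by least squares: the "learned dynamics model".
X = np.hstack([S, A])
W, *_ = np.linalg.lstsq(X, S_next, rcond=None)

def model_step(s, a):
    """One step of imagined dynamics under the learned model."""
    return np.concatenate([s, a]) @ W

# Generate an imagined trajectory entirely inside the model.
s = S[0]
for _ in range(5):
    a = rng.normal(size=2)                         # placeholder policy
    s = model_step(s, a)
```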
Reinforcement learning from large-scale offline datasets allows us to learn policies without potentially unsafe or impractical exploration.
We study a setting where the pruning phase is given a time budget, and identify connections between iterative pruning and multiple sleep cycles in humans.
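As a hedged sketch of what a time-budgeted iterative pruning loop can look like (not the paper's actual procedure; train_one_epoch is a hypothetical stand-in), the code below alternates brief retraining with magnitude pruning until the wall-clock budget runs out.

```python
# Generic sketch of iterative magnitude pruning under a time budget.
import time
import numpy as np

def prune_smallest(weights, frac):
    """Zero out weights below the `frac`-quantile of absolute magnitude."""
    flat = np.abs(weights).ravel()
    k = int(frac * flat.size)
    threshold = np.partition(flat, k)[k] if k > 0 else 0.0
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def iterative_prune(weights, train_one_epoch, budget_seconds, frac=0.2):
    """Alternate brief retraining and pruning until the budget is spent."""
    deadline = time.monotonic() + budget_seconds
    mask = np.ones_like(weights, dtype=bool)
    while time.monotonic() < deadline:
        weights = train_one_epoch(weights) * mask  # retrain, keep sparsity
        weights, mask = prune_smallest(weights, frac)
    return weights, mask
```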
The principle of optimism in the face of uncertainty is prevalent throughout sequential decision making problems such as multi-armed bandits and reinforcement learning (RL).
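A standard concrete instance of this principle is the UCB1 rule for multi-armed bandits: act greedily with respect to an empirical mean inflated by a confidence bonus, so poorly explored arms look optimistically good. The Bernoulli arm means below are illustrative.

```python
# Optimism in the face of uncertainty, instantiated as UCB1 on a
# Bernoulli multi-armed bandit.
import math
import random

means = [0.2, 0.5, 0.7]            # true arm means, unknown to the learner
counts = [0] * len(means)
sums = [0.0] * len(means)

def ucb_index(i, t):
    """Empirical mean plus a confidence bonus that shrinks with more pulls."""
    if counts[i] == 0:
        return float("inf")        # force each arm to be tried once
    return sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])

for t in range(1, 10001):
    arm = max(range(len(means)), key=lambda i: ucb_index(i, t))
    reward = 1.0 if random.random() < means[arm] else 0.0
    counts[arm] += 1
    sums[arm] += reward

print(counts)  # pulls concentrate on the best arm (index 2)
```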
In this paper, we provide: 1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in RL; 2) an explicit discrete-state comparison between active inference and RL on an OpenAI Gym baseline.
We demonstrate our new sensitivity analysis tools in real-world fairness scenarios to assess the bias arising from confounding.