1 code implementation • 28 May 2022 • Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett
By retrospectively deconstructing the search tree into multiple paths each contained within a sub-tree, we enable the agent to learn from shorter trajectories with more predictable next states.
1 code implementation • 27 May 2022 • Thomas D. Barrett, Christopher W. F. Parsonson, Alexandre Laterre
Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500-vertex graphs while also reducing wall-clock time.
no code implementations • 30 Oct 2021 • Clément Bonnet, Paul Caron, Thomas Barrett, Ian Davies, Alexandre Laterre
Self-tuning algorithms that adapt the learning process online encourage more effective and robust learning.
no code implementations • 29 Sep 2021 • Thomas D Barrett, Christopher William Falke Parsonson, Alexandre Laterre
Compared to the nearest competitor, ECORD reduces the optimality gap by up to 73% on 500-vertex graphs while also reducing wall-clock time.
1 code implementation • 3 Jul 2021 • Arnu Pretorius, Kale-ab Tessera, Andries P. Smit, Claude Formanek, St John Grimbly, Kevin Eloff, Siphelele Danisa, Lawrence Francis, Jonathan Shock, Herman Kamper, Willie Brink, Herman Engelbrecht, Alexandre Laterre, Karim Beguir
We provide experimental results for these implementations on a wide range of multi-agent environments and highlight the benefits of distributed system training.
no code implementations • 1 Jan 2021 • Arnu Pretorius, Scott Cameron, Andries Petrus Smit, Elan van Biljon, Lawrence Francis, Femi Azeez, Alexandre Laterre, Karim Beguir
Furthermore, the core utility of our imagination is deeply coupled with communication.
no code implementations • 1 Jan 2021 • Thomas Pierrot, Valentin Macé, Jean-Baptiste Sevestre, Louis Monier, Alexandre Laterre, Nicolas Perrin, Karim Beguir, Olivier Sigaud
Very large action spaces constitute a critical challenge for deep Reinforcement Learning (RL) algorithms.
no code implementations • 3 Dec 2020 • Marcin J. Skwark, Nicolás López Carranza, Thomas Pierrot, Joe Phillips, Slim Said, Alexandre Laterre, Amine Kerkeni, Uğur Şahin, Karim Beguir
This suggests that combining leading protein design methods with modern deep reinforcement learning is a viable path for discovering a COVID-19 cure and may accelerate the design of peptide-based therapeutics for other diseases.
no code implementations • 29 Nov 2020 • Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir
Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment.
1 code implementation • NeurIPS 2020 • Arnu Pretorius, Scott Cameron, Elan van Biljon, Tom Makkink, Shahil Mawjee, Jeremy du Plessis, Jonathan Shock, Alexandre Laterre, Karim Beguir
Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control.
no code implementations • 27 Jul 2020 • Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas
Third, the self-models are harnessed to learn recursive compositional programs with multiple levels of abstraction.
1 code implementation • NeurIPS 2019 • Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, Alexandre Laterre, David Kas, Karim Beguir, Nando de Freitas
AlphaZero contributes powerful neural-network-guided search algorithms, which we augment with recursion.
2 code implementations • 4 Jul 2018 • Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir
Results from applying the R2 algorithm to instances of two-dimensional and three-dimensional bin packing problems show that it outperforms generic Monte Carlo tree search, heuristic algorithms, and integer programming solvers.