no code implementations • 28 Nov 2023 • Daniel Bairamian, Philippe Marcotte, Joshua Romoff, Gabriel Robert, Derek Nowrouzezahrai
In this paper, we propose the Minimax Exploiter, a game theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency.
1 code implementation • 27 Oct 2023 • Roger Creus Castanyer, Joshua Romoff, Glen Berseth
Several exploration objectives like count-based bonuses, pseudo-counts, and state-entropy maximization are non-stationary and hence are difficult to optimize for the agent.
no code implementations • 18 Aug 2023 • Anthony Kobanda, Valliappan C. A., Joshua Romoff, Ludovic Denoyer
Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes.
no code implementations • 22 Dec 2021 • Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf
We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions.
1 code implementation • 22 Dec 2021 • Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal
The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors.
no code implementations • 9 Nov 2020 • Eloi Alonso, Maxim Peter, David Goumard, Joshua Romoff
We test our approach on complex 3D environments in the Unity game engine that are notably an order of magnitude larger than maps typically used in the Deep RL literature.
no code implementations • 6 Jul 2020 • Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau
We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers.
2 code implementations • 31 Jan 2020 • Peter Henderson, Jieru Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau
Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research.
1 code implementation • NeurIPS 2019 • Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat
We show that we can run several loosely coupled GALA agents in parallel on a single GPU and achieve significantly higher hardware utilization and frame-rates than vanilla A2C at comparable power draws.
1 code implementation • 5 Feb 2019 • Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier
In settings where this bias is unacceptable - where the system must optimize for longer horizons at higher discounts - the target of the value function approximator may increase in variance leading to difficulties in learning.
no code implementations • ICLR 2019 • Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau
We propose a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially-observable environments.
1 code implementation • 5 Oct 2018 • Peter Henderson, Joshua Romoff, Joelle Pineau
We find that adaptive optimizers have a narrow window of effective learning rates, diverging in other cases, and that the effectiveness of momentum varies depending on the properties of the environment.
1 code implementation • 6 Jun 2018 • Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent
In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.
1 code implementation • 9 May 2018 • Joshua Romoff, Peter Henderson, Alexandre Piché, Vincent Francois-Lavet, Joelle Pineau
However, introduction of corrupt or stochastic rewards can yield high variance in learning.
1 code implementation • NeurIPS 2017 • Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, Jeffrey Tsang
One of the main challenges in reinforcement learning (RL) is generalisation.
no code implementations • ICLR 2018 • Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen
We consider tackling a single-agent RL problem by distributing it to $n$ learners.
no code implementations • 15 Dec 2016 • Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche
In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task.