Search Results for author: Joshua Romoff

Found 17 papers, 9 papers with code

Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play

no code implementations • 28 Nov 2023 • Daniel Bairamian, Philippe Marcotte, Joshua Romoff, Gabriel Robert, Derek Nowrouzezahrai

In this paper, we propose the Minimax Exploiter, a game-theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency.

Atari Games, Dota 2, +3

Improving Intrinsic Exploration by Creating Stationary Objectives

1 code implementation • 27 Oct 2023 • Roger Creus Castanyer, Joshua Romoff, Glen Berseth

Several exploration objectives, such as count-based bonuses, pseudo-counts, and state-entropy maximization, are non-stationary and hence difficult for the agent to optimize.

Learning Computational Efficient Bots with Costly Features

no code implementations • 18 Aug 2023 • Anthony Kobanda, Valliappan C. A., Joshua Romoff, Ludovic Denoyer

Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes.

Computational Efficiency, D4RL, +1

Graph augmented Deep Reinforcement Learning in the GameRLand3D environment

no code implementations • 22 Dec 2021 • Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf

We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions.

reinforcement-learning, Reinforcement Learning (RL)

Direct Behavior Specification via Constrained Reinforcement Learning

1 code implementation • 22 Dec 2021 • Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal

The standard formulation of Reinforcement Learning lacks a practical way of specifying which behaviors are admissible and which are forbidden.

Continuous Control, reinforcement-learning, +1

Deep Reinforcement Learning for Navigation in AAA Video Games

no code implementations • 9 Nov 2020 • Eloi Alonso, Maxim Peter, David Goumard, Joshua Romoff

We test our approach on complex 3D environments in the Unity game engine that are notably an order of magnitude larger than maps typically used in the Deep RL literature.

Navigate, reinforcement-learning, +2

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

no code implementations • 6 Jul 2020 • Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau

We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers.

Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning

2 code implementations • 31 Jan 2020 • Peter Henderson, Jieru Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau

Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research.

BIG-bench Machine Learning, reinforcement-learning, +1

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

1 code implementation • NeurIPS 2019 • Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat

We show that we can run several loosely coupled GALA agents in parallel on a single GPU and achieve significantly higher hardware utilization and frame-rates than vanilla A2C at comparable power draws.

reinforcement-learning, Reinforcement Learning (RL)

Separating value functions across time-scales

1 code implementation • 5 Feb 2019 • Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier

In settings where this bias is unacceptable (where the system must optimize for longer horizons at higher discounts), the target of the value function approximator may increase in variance, leading to difficulties in learning.

Reinforcement Learning (RL)

TarMAC: Targeted Multi-Agent Communication

no code implementations • ICLR 2019 • Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau

We propose a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially-observable environments.

Multi-agent Reinforcement Learning

Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods

1 code implementation • 5 Oct 2018 • Peter Henderson, Joshua Romoff, Joelle Pineau

We find that adaptive optimizers have a narrow window of effective learning rates, diverging in other cases, and that the effectiveness of momentum varies depending on the properties of the environment.

Continuous Control, Policy Gradient Methods

Randomized Value Functions via Multiplicative Normalizing Flows

1 code implementation • 6 Jun 2018 • Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent

In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.

Efficient Exploration, Thompson Sampling