Search Results for author: Joshua Romoff

Found 17 papers, 9 papers with code

Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play

no code implementations 28 Nov 2023 Daniel Bairamian, Philippe Marcotte, Joshua Romoff, Gabriel Robert, Derek Nowrouzezahrai

In this paper, we propose the Minimax Exploiter, a game theoretic approach to exploiting Main Agents that leverages knowledge of its opponents, leading to significant increases in data efficiency.

Atari Games, Dota 2

Improving Intrinsic Exploration by Creating Stationary Objectives

1 code implementation 27 Oct 2023 Roger Creus Castanyer, Joshua Romoff, Glen Berseth

Several exploration objectives like count-based bonuses, pseudo-counts, and state-entropy maximization are non-stationary and hence are difficult to optimize for the agent.

Learning Computational Efficient Bots with Costly Features

no code implementations 18 Aug 2023 Anthony Kobanda, Valliappan C. A., Joshua Romoff, Ludovic Denoyer

Deep reinforcement learning (DRL) techniques have become increasingly used in various fields for decision-making processes.

Computational Efficiency, D4RL

Graph augmented Deep Reinforcement Learning in the GameRLand3D environment

no code implementations 22 Dec 2021 Edward Beeching, Maxim Peter, Philippe Marcotte, Jilles Debangoye, Olivier Simonin, Joshua Romoff, Christian Wolf

We address planning and navigation in challenging 3D video games featuring maps with disconnected regions reachable by agents using special actions.

reinforcement-learning, Reinforcement Learning (RL)

Direct Behavior Specification via Constrained Reinforcement Learning

1 code implementation 22 Dec 2021 Julien Roy, Roger Girgis, Joshua Romoff, Pierre-Luc Bacon, Christopher Pal

The standard formulation of Reinforcement Learning lacks a practical way of specifying what are admissible and forbidden behaviors.

Continuous Control, reinforcement-learning

Deep Reinforcement Learning for Navigation in AAA Video Games

no code implementations 9 Nov 2020 Eloi Alonso, Maxim Peter, David Goumard, Joshua Romoff

We test our approach on complex 3D environments in the Unity game engine that are notably an order of magnitude larger than maps typically used in the Deep RL literature.

Navigate, reinforcement-learning

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

no code implementations 6 Jul 2020 Joshua Romoff, Peter Henderson, David Kanaa, Emmanuel Bengio, Ahmed Touati, Pierre-Luc Bacon, Joelle Pineau

We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers.

Gossip-based Actor-Learner Architectures for Deep Reinforcement Learning

1 code implementation NeurIPS 2019 Mahmoud Assran, Joshua Romoff, Nicolas Ballas, Joelle Pineau, Michael Rabbat

We show that we can run several loosely coupled GALA agents in parallel on a single GPU and achieve significantly higher hardware utilization and frame-rates than vanilla A2C at comparable power draws.

reinforcement-learning, Reinforcement Learning (RL)

Separating value functions across time-scales

1 code implementation 5 Feb 2019 Joshua Romoff, Peter Henderson, Ahmed Touati, Emma Brunskill, Joelle Pineau, Yann Ollivier

In settings where this bias is unacceptable - where the system must optimize for longer horizons at higher discounts - the target of the value function approximator may increase in variance leading to difficulties in learning.

Reinforcement Learning (RL)

TarMAC: Targeted Multi-Agent Communication

no code implementations ICLR 2019 Abhishek Das, Théophile Gervet, Joshua Romoff, Dhruv Batra, Devi Parikh, Michael Rabbat, Joelle Pineau

We propose a targeted communication architecture for multi-agent reinforcement learning, where agents learn both what messages to send and whom to address them to while performing cooperative tasks in partially-observable environments.

Multi-agent Reinforcement Learning

Where Did My Optimum Go?: An Empirical Analysis of Gradient Descent Optimization in Policy Gradient Methods

1 code implementation 5 Oct 2018 Peter Henderson, Joshua Romoff, Joelle Pineau

We find that adaptive optimizers have a narrow window of effective learning rates, diverging in other cases, and that the effectiveness of momentum varies depending on the properties of the environment.

Continuous Control, Policy Gradient Methods

Randomized Value Functions via Multiplicative Normalizing Flows

1 code implementation 6 Jun 2018 Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent

In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.

Efficient Exploration, Thompson Sampling

Separation of Concerns in Reinforcement Learning

no code implementations 15 Dec 2016 Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche

In this paper, we propose a framework for solving a single-agent task by using multiple agents, each focusing on different aspects of the task.

reinforcement-learning, Reinforcement Learning (RL)
