Search Results for author: Joseph Modayil

Found 12 papers, 1 paper with code

The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

no code implementations17 Mar 2022 Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White

Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently.

Decision Making, reinforcement-learning

Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making

no code implementations11 Jan 2022 Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski

We further show how to computationally build this adaptive signalling process out of a fixed signalling process, characterized by fast continual prediction learning and minimal constraints on the nature of the agent receiving signals.

Decision Making, reinforcement-learning
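The fixed signalling process described above can be sketched as a general value function (GVF) learned by linear TD(0), with a fixed threshold rule turning the continually updated prediction into a token for another agent. This is a minimal illustrative sketch only; the class name, features, and threshold rule are assumptions, not the paper's implementation.

```python
# Hypothetical sketch of Pavlovian signalling built on a general value
# function (GVF): a linear TD(0) learner continually predicts an upcoming
# stimulus (the GVF cumulant), and a fixed threshold rule converts that
# prediction into a binary token. All names and parameters are illustrative.
import numpy as np

class GVFSignaller:
    def __init__(self, n_features, gamma=0.9, alpha=0.1, threshold=0.5):
        self.w = np.zeros(n_features)  # linear prediction weights
        self.gamma = gamma             # fixed timescale of the prediction
        self.alpha = alpha
        self.threshold = threshold     # fixed signalling rule

    def update(self, x, cumulant, x_next):
        # Standard TD(0) update toward the discounted sum of future cumulants.
        delta = cumulant + self.gamma * (self.w @ x_next) - self.w @ x
        self.w += self.alpha * delta * x
        return delta

    def signal(self, x):
        # Pavlovian signalling: emit a token whenever the learned
        # prediction of the stimulus crosses the fixed threshold.
        return int(self.w @ x > self.threshold)
```

The key property the paper highlights is that the signalling rule itself stays fixed while the prediction underneath it adapts quickly from the experience stream.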

Adapting the Function Approximation Architecture in Online Reinforcement Learning

no code implementations17 Jun 2021 John D. Martin, Joseph Modayil

However, prevailing optimization techniques are not designed for strictly-incremental online updates.


On Inductive Biases in Deep Reinforcement Learning

no code implementations5 Jul 2019 Matteo Hessel, Hado van Hasselt, Joseph Modayil, David Silver

These inductive biases can take many forms, including domain knowledge and pretuned hyper-parameters.

Continuous Control, reinforcement-learning

Ray Interference: a Source of Plateaus in Deep Reinforcement Learning

no code implementations25 Apr 2019 Tom Schaul, Diana Borsa, Joseph Modayil, Razvan Pascanu

Rather than proposing a new method, this paper investigates an issue present in existing learning algorithms.


Deep Reinforcement Learning and the Deadly Triad

no code implementations6 Dec 2018 Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models (deep Q-networks trained with experience replay), analysing how the components of this system contribute to the emergence of the deadly triad and to the agent's performance.

Learning Theory, reinforcement-learning
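The "deadly triad" names the combination of function approximation, bootstrapping, and off-policy updates. The paper studies it empirically in deep Q-networks; the snippet below instead sketches the classic two-state construction (the textbook "w, 2w" example) where the three ingredients together make the weight diverge, purely as an illustration of what the triad means.

```python
# Classic two-state divergence sketch (not from this paper): one weight w,
# linear values v(x) = 1*w and v(y) = 2*w, reward 0, gamma near 1.
# Off-policy, we repeatedly update only the transition x -> y,
# bootstrapping from v(y) = 2w, and w grows without bound.
gamma = 0.99
alpha = 0.1
w = 1.0
history = [w]
for _ in range(100):
    delta = 0.0 + gamma * (2 * w) - (1 * w)  # bootstrapped TD error
    w += alpha * delta * 1.0                 # feature of state x is 1
    history.append(w)
```

Each update multiplies w by 1 + alpha * (2 * gamma - 1) > 1, so the value estimate explodes even though every individual ingredient is harmless on its own.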

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations16 Nov 2018 Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Universal Option Models

no code implementations NeurIPS 2014 Hengshuai Yao, Csaba Szepesvari, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar

We prove that the UOM of an option can construct a traditional option model given a reward function, and the option-conditional return is computed directly by a single dot-product of the UOM with the reward function.
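The dot-product claim above can be checked on a toy deterministic chain: accumulate the discounted state occupancies while the option runs, then recover the option return for any reward function as a single dot product. The state space, dynamics, and function names below are illustrative assumptions, not the paper's setup.

```python
# Toy check of the UOM idea: the option-conditional return equals the
# dot product of a discounted state-occupancy vector with the reward vector.
import numpy as np

gamma = 0.9
# Deterministic chain 0 -> 1 -> 2; the option terminates on reaching state 2.
next_state = {0: 1, 1: 2}

def occupancy(start):
    # Discounted state-occupancy vector u(s), accumulated while the option runs.
    u = np.zeros(3)
    s, discount = start, 1.0
    while s != 2:
        u[s] += discount
        discount *= gamma
        s = next_state[s]
    return u

def rollout_return(start, r):
    # Directly computed discounted return of the option under reward vector r.
    g, s, discount = 0.0, start, 1.0
    while s != 2:
        g += discount * r[s]
        discount *= gamma
        s = next_state[s]
    return g

u0 = occupancy(0)  # reusable across any reward function
```

The appeal is that `u0` is computed once, independent of reward, and then `u0 @ r` answers the return query for every new reward vector without re-simulating the option.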

Multi-timescale Nexting in a Reinforcement Learning Robot

no code implementations6 Dec 2011 Joseph Modayil, Adam White, Richard S. Sutton

The term "nexting" has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense.
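Nexting is realized in the paper as many predictions of sensor signals learned in parallel at multiple timescales. A minimal sketch, assuming linear features and one shared experience stream, is a bank of TD(0) learners that differ only in their discount gamma (all names and parameters below are illustrative):

```python
# Illustrative sketch of multi-timescale nexting: a bank of linear TD(0)
# predictions of one sensor signal, each at its own timescale gamma.
# Not the robot implementation from the paper.
import numpy as np

class NextingPredictor:
    def __init__(self, n_features, gammas, alpha=0.1):
        self.gammas = np.asarray(gammas)
        self.W = np.zeros((len(gammas), n_features))  # one weight row per timescale
        self.alpha = alpha

    def update(self, x, sensor, x_next):
        # One TD(0) update per timescale, all driven by the same experience.
        preds = self.W @ x
        targets = sensor + self.gammas * (self.W @ x_next)
        self.W += self.alpha * np.outer(targets - preds, x)

    def predict(self, x):
        # Vector of "what happens next" predictions, near-term to long-term.
        return self.W @ x
```

For a constant sensor signal c, each prediction converges to c / (1 - gamma), i.e. the same signal summarized at progressively longer horizons.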

