Search Results for author: Joseph Modayil

Found 15 papers, 2 papers with code

The Ungrounded Alignment Problem

1 code implementation • 8 Aug 2024 • Marc Pickett, Aakash Kumar Nain, Joseph Modayil, Llion Jones

This paper examines a simplified version of the general problem, where an unsupervised learner is presented with a sequence of images for the characters in a text corpus, and this learner is later evaluated on its ability to recognize specific (possibly rare) sequential patterns.

Towards model-free RL algorithms that scale well with unstructured data

no code implementations • 3 Nov 2023 • Joseph Modayil, Zaheer Abbas

RL practitioners lack a systematic way to study how well a single RL algorithm performs when instantiated across a range of problem scales, and they lack function approximation techniques that scale well with unstructured observations.

Reinforcement Learning (RL)

The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

no code implementations • 17 Mar 2022 • Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White

Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently.

Decision Making • reinforcement-learning • +1

Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making

no code implementations • 11 Jan 2022 • Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski

We further show how to computationally build this adaptive signalling process out of a fixed signalling process, characterized by fast continual prediction learning and minimal constraints on the nature of the agent receiving signals.

Decision Making • reinforcement-learning • +1
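The "fast continual prediction learning" behind Pavlovian signalling is built on general value functions (GVFs). A minimal sketch of one such predictor, assuming a linear TD(0) learner (the function name, features, and parameters below are illustrative assumptions, not the paper's code):

```python
import numpy as np

def td0_update(w, x, x_next, cumulant, gamma, alpha):
    """One online TD(0) step for a general value function (GVF) with
    linear features; `cumulant` is the signal of interest being predicted."""
    delta = cumulant + gamma * w.dot(x_next) - w.dot(x)  # TD error
    return w + alpha * delta * x

# Toy usage: predict a constant cumulant of 1.0 at gamma = 0.5 from a
# single constant feature; the prediction converges to 1 / (1 - gamma) = 2.
w = np.zeros(1)
x = np.ones(1)
for _ in range(500):
    w = td0_update(w, x, x, cumulant=1.0, gamma=0.5, alpha=0.1)
```

Because the update touches only the current transition, such a predictor can run continually during interaction, which is what allows the signalling process to adapt online.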

Deep Reinforcement Learning and the Deadly Triad

no code implementations • 6 Dec 2018 • Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil

In this work, we investigate the impact of the deadly triad in practice, in the context of a family of popular deep reinforcement learning models (deep Q-networks trained with experience replay), analysing how the components of this system play a role in the emergence of the deadly triad and in the agent's performance.

Learning Theory • reinforcement-learning • +2

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations • 16 Nov 2018 • Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Universal Option Models

no code implementations • NeurIPS 2014 • Hengshuai Yao, Csaba Szepesvari, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar

We prove that the UOM of an option can construct a traditional option model given a reward function, and the option-conditional return is computed directly by a single dot-product of the UOM with the reward function.
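The dot-product claim can be illustrated with a small tabular sketch (the states, transitions, and numbers below are invented for illustration, not the paper's construction): the universal option model (UOM) stores a reward-independent discounted state-occupancy vector per state, and any reward vector then yields the option-conditional return via a single dot product.

```python
import numpy as np

gamma = 0.9
# Transitions under the option's policy for a toy 3-state chain.
P = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 1.0]])
beta = np.array([0.0, 0.0, 1.0])   # option termination probability per state

# Discounted occupancy matrix: row s is the UOM vector u(s), satisfying
# u(s) = e_s + gamma * sum_{s'} P[s, s'] * (1 - beta[s']) * u(s').
U = np.linalg.inv(np.eye(3) - gamma * P @ np.diag(1.0 - beta))

# Given any reward vector r, the option-conditional return from each state
# is a dot product of that state's UOM row with r.
r = np.array([0.0, 1.0, 0.0])
returns = U @ r   # returns[s] = u(s) . r
```

The occupancy matrix is built once per option; swapping in a new reward function costs only the final dot product, which is the efficiency the abstract highlights.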

Multi-timescale Nexting in a Reinforcement Learning Robot

no code implementations • 6 Dec 2011 • Joseph Modayil, Adam White, Richard S. Sutton

The term "nexting" has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense.

reinforcement-learning • Reinforcement Learning • +1
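A nexting prediction at timescale gamma is the discounted sum of a signal's future values, so predictions at many timescales come from the same recursion with different discounts. A minimal sketch of computing these ideal targets offline (the signal and numbers are illustrative assumptions):

```python
import numpy as np

def nexting_targets(signal, gamma):
    """Ideal nexting targets G_t = sum_{k>=0} gamma**k * signal[t+1+k],
    computed with the backward recursion g[t] = signal[t+1] + gamma * g[t+1]."""
    g = np.zeros(len(signal))
    for t in range(len(signal) - 2, -1, -1):
        g[t] = signal[t + 1] + gamma * g[t + 1]
    return g

# A smaller gamma gives a shorter timescale (~1 / (1 - gamma) steps),
# so the prediction anticipates only the immediate future.
signal = np.array([0.0, 0.0, 1.0, 0.0])
short = nexting_targets(signal, gamma=0.5)
```

In the robot setting these targets are not computed offline; they are learned online with temporal-difference methods, one predictor per signal and timescale.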
