no code implementations • 18 Dec 2023 • Tristan Bester, Benjamin Rosman, Steven James, Geraud Nangue Tasse
We present counting reward automata, a finite state machine variant capable of modelling any reward function expressible as a formal language.
1 code implementation • 31 May 2023 • Geraud Nangue Tasse, Tamlin Love, Mark Nemecek, Steven James, Benjamin Rosman
A common solution is for a human expert to define either a penalty in the reward function or a cost to be minimised when reaching unsafe states.
no code implementations • 23 Jun 2022 • Geraud Nangue Tasse, Benjamin Rosman, Steven James
We propose world value functions (WVFs), a type of goal-oriented general value function that represents how to solve not just a given task, but any other goal-reaching task in an agent's environment.
no code implementations • 25 May 2022 • Geraud Nangue Tasse, Devon Jarvis, Steven James, Benjamin Rosman
The agent can then flexibly compose them both logically and temporally to provably achieve temporal logic specifications in any regular language, such as regular fragments of linear temporal logic.
no code implementations • 18 May 2022 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
In this work, we propose world value functions (WVFs), a type of general value function with mastery of the world: they represent not only how to solve a given task, but also how to solve any other goal-reaching task.
1 code implementation • 1 Mar 2022 • JunKyu Lee, Michael Katz, Don Joven Agravante, Miao Liu, Geraud Nangue Tasse, Tim Klinger, Shirin Sohrabi
Our approach defines options in hierarchical reinforcement learning (HRL) from AI planning (AIP) operators by establishing a correspondence between the state transition model of an AI planning problem and the abstract state transition system of a Markov decision process (MDP).
no code implementations • 9 Oct 2021 • Vanya Cohen, Geraud Nangue Tasse, Nakul Gopalan, Steven James, Matthew Gombolay, Benjamin Rosman
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks that share components of their task descriptions.
no code implementations • ICLR 2022 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
We leverage logical composition in reinforcement learning to create a framework that enables an agent to autonomously determine whether a new task can be immediately solved using its existing abilities, or whether a task-specific skill should be learned.
no code implementations • ICML Workshop LifelongML 2020 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
The ability to produce novel behaviours from existing skills is an important property of lifelong learning agents.
1 code implementation • NeurIPS 2020 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
The ability to compose learned skills to solve new tasks is an important property of lifelong-learning agents.