no code implementations • 15 Nov 2013 • M. M. Hassan Mahmud, Majd Hawasly, Benjamin Rosman, Subramanian Ramamoorthy
The source subset forms an '$\epsilon$-net' over the original set of MDPs, in the sense that for each previous MDP $M_p$, there is a source $M^s$ whose optimal policy has $<\epsilon$ regret in $M_p$.
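The ε-net condition above can be illustrated as a greedy set-cover: given a (hypothetical, precomputed) regret matrix where `regret[s][p]` is the regret of source `s`'s optimal policy in previous MDP `p`, repeatedly pick the source that covers the most still-uncovered MDPs within ε. This is only a sketch of the covering idea, not the paper's construction.

```python
# Greedy sketch of selecting an epsilon-net of source MDPs.
# `regret[s][p]` is a hypothetical precomputed matrix: the regret of
# source s's optimal policy when executed in previous MDP p.

def greedy_epsilon_net(regret, eps):
    """Return indices of sources covering every previous MDP within eps regret."""
    num_sources = len(regret)
    num_prev = len(regret[0])
    uncovered = set(range(num_prev))
    net = []
    while uncovered:
        # Pick the source whose policy covers the most uncovered MDPs.
        best = max(range(num_sources),
                   key=lambda s: sum(regret[s][p] < eps for p in uncovered))
        covered = {p for p in uncovered if regret[best][p] < eps}
        if not covered:  # no remaining source covers anything within eps
            raise ValueError("no eps-net exists for this regret matrix")
        net.append(best)
        uncovered -= covered
    return net
```

Greedy set-cover gives a logarithmic approximation to the smallest such net, which is usually adequate when the goal is just a compact source library.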
no code implementations • 1 May 2015 • Benjamin Rosman, Majd Hawasly, Subramanian Ramamoorthy
We formalise the problem of policy reuse, and present an algorithm for efficiently responding to a novel task instance by reusing a policy from the library of existing policies, where the choice is based on observed 'signals' that correlate with policy performance.
no code implementations • 8 Dec 2016 • Andrew M. Saxe, Adam Earle, Benjamin Rosman
Hierarchical architectures are critical to the scalability of reinforcement learning methods.
no code implementations • ICLR 2018 • Adam C. Earle, Andrew M. Saxe, Benjamin Rosman
Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains.
no code implementations • ICML 2017 • Andrew M. Saxe, Adam C. Earle, Benjamin Rosman
Hierarchical architectures are critical to the scalability of reinforcement learning methods.
no code implementations • 10 Jan 2018 • Craig Innes, Alex Lascarides, Stefano V. Albrecht, Subramanian Ramamoorthy, Benjamin Rosman
Methods for learning optimal policies in autonomous agents often assume that the way the domain is conceptualised (its possible states and actions and their causal structure) is known in advance and does not change during learning.
no code implementations • 26 Jan 2018 • Tadahiro Taniguchi, Emre Ugur, Matej Hoffmann, Lorenzo Jamone, Takayuki Nagai, Benjamin Rosman, Toshihiko Matsuka, Naoto Iwahashi, Erhan Oztop, Justus Piater, Florentin Wörgötter
However, the symbol grounding problem was originally posed to connect symbolic AI and sensorimotor information; it did not consider many interdisciplinary phenomena in human communication and dynamic symbol systems in society that semiotics has considered.
no code implementations • 12 Jul 2018 • Benjamin van Niekerk, Steven James, Adam Earle, Benjamin Rosman
An important property for lifelong-learning agents is the ability to combine existing skills to solve unseen tasks.
no code implementations • 27 Sep 2018 • Adam C Earle, Andrew M Saxe, Benjamin Rosman
Exploration is a well known challenge in Reinforcement Learning.
no code implementations • NeurIPS 2018 • Ofir Marom, Benjamin Rosman
Object-oriented representations in reinforcement learning have shown promise in transfer learning, with previous research introducing a propositional object-oriented framework that has provably efficient learning bounds with respect to sample complexity.
1 code implementation • 15 Jan 2019 • Montaser Mohammedalamen, Waleed D. Khamies, Benjamin Rosman
In this paper, we apply reinforcement learning (RL) techniques to train a realistic biomechanical model to work with different people and in different walking environments.
no code implementations • ICML 2020 • Steven James, Benjamin Rosman, George Konidaris
We present a framework for autonomously learning a portable representation that describes a collection of low-level continuous environments.
no code implementations • 29 Aug 2019 • Adam Pantanowitz, Emmanuel Cohen, Philippe Gradidge, Nigel Crowther, Vered Aharonson, Benjamin Rosman, David M Rubin
Obesity is an important concern in public health, and Body Mass Index is one of the most useful (and widely used) measures.
no code implementations • 25 Sep 2019 • Devon Jarvis, Richard Klein, Benjamin Rosman
Whether the width of the basin of attraction surrounding a minimum in parameter space indicates the generalisability of a model parametrisation is a point of contention in the training of artificial neural networks; the dominant view is that wider basins in the landscape reflect better generalisation by the trained model.
no code implementations • 13 Oct 2019 • Arnu Pretorius, Elan van Biljon, Benjamin van Niekerk, Ryan Eloff, Matthew Reynard, Steve James, Benjamin Rosman, Herman Kamper, Steve Kroon
Our results therefore suggest that, in the shallow-to-moderate depth setting, critical initialisation provides zero performance gains when compared to off-critical initialisations, and that searching for off-critical initialisations that might improve training speed or generalisation is likely to be a fruitless endeavour.
1 code implementation • NeurIPS 2020 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
The ability to compose learned skills to solve new tasks is an important property of lifelong-learning agents.
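In the spirit of the logical skill composition described above, a common construction is to combine the learned Q-functions of two tasks elementwise: under the relevant assumptions, the disjunction ("A or B") is the elementwise maximum and the conjunction ("A and B") the elementwise minimum. The sketch below is illustrative only; the paper's formal treatment imposes conditions on the tasks and value functions that are omitted here.

```python
import numpy as np

# Hedged sketch of zero-shot logical composition of learned Q-functions
# (tabular, states x actions). Under the assumptions of Boolean task
# composition, OR is the elementwise max and AND the elementwise min.

def q_or(q_a, q_b):
    """Q-function for 'solve task A or task B'."""
    return np.maximum(q_a, q_b)

def q_and(q_a, q_b):
    """Q-function for 'solve task A and task B'."""
    return np.minimum(q_a, q_b)

def greedy_policy(q):
    """Greedy action per state for a composed Q-function."""
    return np.argmax(q, axis=-1)
```

No further learning is needed for the composed task: the composed Q-function is acted on greedily, exactly as a learned one would be.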
no code implementations • 7 Apr 2020 • Benjamin van Niekerk, Andreas Damianou, Benjamin Rosman
The environment's dynamics are learned from limited training data and can be reused in new task instances without retraining.
no code implementations • ICML Workshop LifelongML 2020 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
The ability to produce novel behaviours from existing skills is an important property of lifelong learning agents.
no code implementations • ICLR 2022 • Steven James, Benjamin Rosman, George Konidaris
Such representations can immediately be transferred between tasks that share the same types of objects, resulting in agents that require fewer samples to learn a model of a new task.
no code implementations • 2 Feb 2021 • Kale-ab Tessera, Sara Hooker, Benjamin Rosman
Based upon these findings, we show that gradient flow in sparse networks can be improved by reconsidering aspects of the architecture design and the training regime.
no code implementations • ICLR 2022 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
We leverage logical composition in reinforcement learning to create a framework that enables an agent to autonomously determine whether a new task can be immediately solved using its existing abilities, or whether a task-specific skill should be learned.
no code implementations • 29 Sep 2021 • Devon Jarvis, Richard Klein, Benjamin Rosman, Andrew M Saxe
We introduce a minimal space of datasets with systematic and non-systematic features in both the input and output.
no code implementations • 9 Oct 2021 • Vanya Cohen, Geraud Nangue Tasse, Nakul Gopalan, Steven James, Matthew Gombolay, Benjamin Rosman
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks that share components of their task descriptions.
no code implementations • 4 May 2022 • Steven James, Benjamin Rosman, George Konidaris
We propose a framework for autonomously learning state abstractions of an agent's environment, given a set of skills.
no code implementations • 18 May 2022 • Geraud Nangue Tasse, Steven James, Benjamin Rosman
In this work we propose world value functions (WVFs), which are a type of general value function with mastery of the world: they represent not only how to solve a given task, but also how to solve any other goal-reaching task.
no code implementations • 25 May 2022 • Geraud Nangue Tasse, Devon Jarvis, Steven James, Benjamin Rosman
The agent can then flexibly compose them both logically and temporally to provably achieve temporal logic specifications in any regular language, such as regular fragments of linear temporal logic.
no code implementations • 23 Jun 2022 • Geraud Nangue Tasse, Benjamin Rosman, Steven James
We propose world value functions (WVFs), a type of goal-oriented general value function that represents how to solve not just a given task, but any other goal-reaching task in an agent's environment.
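A minimal way to picture a WVF is as a goal-conditioned value table `wvf[s, g, a]` learned once for the environment; a new goal-reaching task is then solved by maximising over that task's goal set only, with no further learning. The sketch below is a tabular illustration under that assumption, not the paper's implementation.

```python
import numpy as np

# Minimal sketch of acting with a world value function (WVF),
# represented here as a table wvf[state, goal, action]. A new
# goal-reaching task is specified only by its goal indices; its
# greedy policy is recovered by maximising over those goals.
# All names and shapes are illustrative.

def task_policy(wvf, goals):
    """Greedy action per state for the task whose desired goals are `goals`."""
    q_task = wvf[:, goals, :].max(axis=1)  # max over the task's own goals
    return q_task.argmax(axis=-1)          # best action in each state
```

Because the same table serves every goal set, changing the task amounts to changing `goals`, which is what gives the WVF its claimed "mastery of the world".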
no code implementations • 16 Oct 2022 • Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman
The ability to generate synthetic data has a variety of use cases across different domains.
no code implementations • 16 Oct 2022 • Herkulaas MvE Combrink, Vukosi Marivate, Benjamin Rosman
While much effort and detail has gone into explaining algorithmic decision making in this context, there is still a need to develop data collection strategies. Therefore, the purpose of this paper is to outline a data collection framework specific to recommender systems within this context, in order to reduce collection biases, understand student characteristics, and find an ideal way to infer optimal influences on the student journey.
no code implementations • 1 Nov 2022 • Herkulaas Combrink, Vukosi Marivate, Benjamin Rosman
Advances in reinforcement learning research have demonstrated the ways in which different agent-based models can learn how to optimally perform a task within a given environment.
1 code implementation • 3 Feb 2023 • Michael Beukman, Manuel Fokam, Marcel Kruger, Guy Axelrod, Muhammad Nasir, Branden Ingram, Benjamin Rosman, Steven James
Procedural content generation (PCG) is a growing field, with numerous applications in the video game industry and great potential to help create better games at a fraction of the cost of manual creation.
no code implementations • 28 Mar 2023 • Siddarth Singh, Benjamin Rosman
In the field of cooperative multi-agent reinforcement learning (MARL), the standard paradigm is the use of centralised training and decentralised execution where a central critic conditions the policies of the cooperative agents based on a central state.
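The centralised-training, decentralised-execution (CTDE) paradigm described above can be sketched in a few lines: each agent's actor sees only its local observation, while a central critic scores behaviour using the global state and all agents' actions. The linear models below are placeholders for the neural networks used in practice; everything here is illustrative.

```python
import numpy as np

# Toy sketch of CTDE: decentralised actors, one centralised critic.
rng = np.random.default_rng(0)

n_agents, obs_dim, state_dim, n_actions = 2, 3, 6, 4
actor_weights = [rng.normal(size=(obs_dim, n_actions)) for _ in range(n_agents)]
critic_weights = rng.normal(size=(state_dim + n_agents,))

def act(agent, local_obs):
    # Decentralised execution: only the agent's own observation is used.
    return int(np.argmax(local_obs @ actor_weights[agent]))

def central_value(global_state, joint_action):
    # Centralised training: the critic conditions on the global state
    # and the joint action of all agents.
    return float(np.concatenate([global_state, joint_action]) @ critic_weights)
```

At training time, gradients from `central_value` shape the actors; at execution time only `act` is needed, so no agent requires access to the central state.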
1 code implementation • 31 May 2023 • Geraud Nangue Tasse, Tamlin Love, Mark Nemecek, Steven James, Benjamin Rosman
A common solution is for a human expert to define either a penalty in the reward function or a cost to be minimised when reaching unsafe states.
1 code implementation • 15 Aug 2023 • Rowan Hodson, Bruce Bassett, Charel van Hoof, Benjamin Rosman, Mark Solms, Jonathan P. Shock, Ryan Smith
First, we compare the performance of SI with Bayesian reinforcement learning (RL) schemes designed to solve similar problems.
2 code implementations • NeurIPS 2023 • Michael Beukman, Devon Jarvis, Richard Klein, Steven James, Benjamin Rosman
To this end, we introduce a neural network architecture, the Decision Adapter, which generates the weights of an adapter module and conditions the behaviour of an agent on the context information.
no code implementations • 30 Nov 2023 • Kale-ab Tessera, Callum Rhys Tilbury, Sasha Abramowitz, Ruan de Kock, Omayma Mahjoub, Benjamin Rosman, Sara Hooker, Arnu Pretorius
Optimising deep neural networks is a challenging task due to complex training dynamics, high computational requirements, and long training times.
no code implementations • 18 Dec 2023 • Tristan Bester, Benjamin Rosman, Steven James, Geraud Nangue Tasse
We present counting reward automata, a finite-state machine variant capable of modelling any reward function expressible as a formal language.
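The key idea, a finite-state machine augmented with counters so that rewards can depend on non-regular patterns (e.g. "deliver only after collecting at least n items, for any n"), can be sketched as below. The event labels and transition structure are illustrative, not the paper's formalism.

```python
# Hedged sketch of a counting reward machine: a finite-state machine
# with a counter, emitting reward when a counter-guarded transition fires.

class CountingRewardMachine:
    def __init__(self, n_required):
        self.n_required = n_required  # counter threshold for the guard
        self.count = 0                # items collected so far
        self.done = False             # terminal automaton state

    def step(self, event):
        """Consume one labelled event; return the reward emitted."""
        if self.done:
            return 0.0
        if event == "collect":
            self.count += 1           # counter update on 'collect'
        if event == "deliver" and self.count >= self.n_required:
            self.done = True          # guarded transition to terminal state
            return 1.0                # reward only when the guard holds
        return 0.0
```

Because the guard tests an unbounded counter, this machine recognises a pattern no plain finite-state reward machine can, which is exactly the expressiveness gap the abstract refers to.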
no code implementations • 16 Feb 2024 • Tristan Bester, Benjamin Rosman
Financial inclusion ensures that individuals have access to financial products and services that meet their needs.