In this work, we tackle the problem of open-ended learning by introducing a method that simultaneously evolves agents and increasingly challenging environments.
We propose world value functions (WVFs), a type of goal-oriented general value function that represents how to solve not just a given task, but any other goal-reaching task in an agent's environment.
We introduce skill machines, a representation that can be learned directly from reward machines and that encodes the solution to the tasks they specify.
In this work we propose world value functions (WVFs), which are a type of general value function with mastery of the world: they represent not only how to solve a given task, but also how to solve any other goal-reaching task.
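A minimal way to picture a WVF is as a goal-conditioned value function: a single table Q[s, g, a] that answers "how do I reach any goal g from state s?". The sketch below is a generic goal-conditioned Q-learning toy on a five-state chain, not the paper's exact WVF construction; the environment, hyperparameters, and reward of 1 at the goal are all illustrative assumptions.

```python
import numpy as np

# Toy chain world: states 0..4, actions left (a=0) / right (a=1).
# One goal-conditioned table Q[s, g, a] is learned for ALL goals,
# so any goal-reaching task can then be solved greedily.
# (Generic sketch; not the paper's exact WVF definition.)
N, GAMMA, ALPHA = 5, 0.9, 0.5
Q = np.zeros((N, N, 2))
rng = np.random.default_rng(0)

def step(s, a):
    # Move left or right, clamped to the chain's ends.
    return max(0, min(N - 1, s + (1 if a == 1 else -1)))

for _ in range(5000):
    g = rng.integers(N)  # sample a goal-reaching task
    s = rng.integers(N)
    for _ in range(20):
        # Epsilon-greedy action selection for this goal.
        a = rng.integers(2) if rng.random() < 0.3 else int(Q[s, g].argmax())
        s2 = step(s, a)
        r, done = (1.0, True) if s2 == g else (0.0, False)
        target = r if done else GAMMA * Q[s2, g].max()
        Q[s, g, a] += ALPHA * (target - Q[s, g, a])
        if done:
            break
        s = s2

# Acting greedily with respect to any chosen goal solves that task.
s, g = 0, 4
for _ in range(2 * N):
    if s == g:
        break
    s = step(s, int(Q[s, g].argmax()))
print("reached:", s)
```

Here the same learned table is reused for every goal, which is the sense in which such a value function has "mastery" of its environment.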
In this work, we investigate the properties of data that cause popular representation learning approaches to fail.
We further demonstrate that a fixed wavelet basis performs comparably to the high-performing Fourier basis on Mountain Car and Acrobot, and that the adaptive methods offer a convenient way to handle an oversized initial basis set while matching or exceeding the performance of the fixed wavelet basis.
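For context, the Fourier basis mentioned above is the standard fixed basis for linear value-function approximation: features of the form cos(pi * c . s) over a state normalised to [0, 1]^d, one feature per integer coefficient vector c up to a chosen order. The function name and example state below are my own; this is a sketch of that standard basis, not of the wavelet or adaptive methods.

```python
import itertools
import numpy as np

def fourier_features(state, order):
    """Order-n Fourier basis: cos(pi * c . s) for every integer
    coefficient vector c in {0..order}^d, with s in [0, 1]^d.
    A linear value estimate is then V(s) = w . phi(s)."""
    d = len(state)
    coeffs = np.array(list(itertools.product(range(order + 1), repeat=d)))
    return np.cos(np.pi * coeffs @ np.asarray(state))

# e.g. a normalised 2-D Mountain Car state (position, velocity):
phi = fourier_features([0.3, 0.7], order=2)
print(phi.shape)  # (9,) -- (order+1)^d features
```

The feature count grows as (order+1)^d, which is exactly the "oversized basis set" problem the adaptive methods above are meant to address.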
Procedurally generated video game content has the potential to drastically reduce the content creation budget of game developers and large studios.
Learning disentangled representations with variational autoencoders (VAEs) is often attributed to the regularisation component of the loss.
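The regularisation component in question is the KL term of the VAE objective, which in beta-VAE-style variants is up-weighted to encourage disentanglement. A minimal numpy sketch of that loss, assuming a diagonal-Gaussian posterior and a squared-error reconstruction term (function names are mine):

```python
import numpy as np

def diag_gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) -- the VAE
    regularisation term the disentanglement claim refers to."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var)

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    # Reconstruction error plus the beta-weighted KL regulariser;
    # beta > 1 strengthens the pressure usually credited with
    # disentangling the latent dimensions.
    recon = np.sum((x - x_recon) ** 2)
    return recon + beta * diag_gaussian_kl(mu, log_var)

# The KL vanishes exactly when the posterior matches the prior:
print(diag_gaussian_kl(np.zeros(3), np.zeros(3)))  # 0.0
```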
Graph neural networks (GNNs) build on the success of deep learning models by extending them for use in graph spaces.
With increasing interest in procedural content generation by academia and game developers alike, it is vital that different approaches can be compared fairly.
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks that share components of their task descriptions.
We leverage logical composition in reinforcement learning to create a framework that enables an agent to autonomously determine whether a new task can be immediately solved using its existing abilities, or whether a task-specific skill should be learned.
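In the simplest zero-shot setting from this line of work, logical composition acts directly on learned value functions: disjunction of tasks roughly corresponds to an elementwise max over their Q-values, conjunction to a min. The tables, threshold, and decision helper below are hypothetical illustrations of that idea, not the paper's exact procedure.

```python
import numpy as np

# Hypothetical Q-tables (4 states x 2 actions) for two base tasks.
rng = np.random.default_rng(1)
Q_left = rng.random((4, 2))
Q_right = rng.random((4, 2))

# Zero-shot logical composition over the learned values:
Q_or = np.maximum(Q_left, Q_right)   # "reach left OR right goal"
Q_and = np.minimum(Q_left, Q_right)  # "satisfy both objectives"

def can_solve_zero_shot(Q_composed, threshold):
    # Toy version of the agent's decision: if the composed values
    # already promise enough return from every state, reuse the
    # existing skills; otherwise learn a task-specific skill.
    return bool(Q_composed.max(axis=1).min() >= threshold)
```

The appeal is that the decision requires no new environment interaction: it is a check on values the agent already has.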
Such representations can immediately be transferred between tasks that share the same types of objects, resulting in agents that require fewer samples to learn a model of a new task.
The ability to produce novel behaviours from existing skills is an important property of lifelong learning agents.
We present a method for learning options from segmented demonstration trajectories.
Deep neural networks are typically too computationally expensive to run in real-time on consumer-grade hardware and low-powered devices.
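A common baseline for shrinking networks toward real-time inference is magnitude pruning: zeroing the smallest-magnitude fraction of weights. The sketch below shows that generic baseline on a raw weight matrix; it is an assumption-laden illustration, not the specific pruning method behind the result above.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out (at least) the smallest-magnitude `sparsity`
    fraction of weights. Ties at the threshold are also pruned."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the pruning threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    return np.where(np.abs(weights) <= threshold, 0.0, weights)

W = np.array([[0.05, -1.2], [0.8, -0.01]])
print(magnitude_prune(W, 0.5))  # the two smallest weights become 0
```

In practice the pruned network is then fine-tuned, since removing weights without retraining usually costs accuracy.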
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
An important property for lifelong-learning agents is the ability to combine existing skills to solve unseen tasks.