Meta Reinforcement Learning for Fast Adaptation of Hierarchical Policies

NeurIPS 2021 · David Kuric, Herke van Hoof

Hierarchical methods have the potential to allow reinforcement learning to scale to larger environments. Decomposing a task into transferable components, however, remains a challenging problem. In this paper, we propose a meta-learning approach for learning such a decomposition within the options framework. We formulate the objective as a bi-level optimization problem in which the sub-policies and their terminations should facilitate fast learning on a family of tasks. Once such a set of options is obtained, it can be reused in new tasks, where only the sequencing of options needs to be learned. Our formulation tends to produce options under which fewer decisions are needed to solve these new tasks. Experimentally, we show that our method learns transferable components that accelerate learning, and that it outperforms existing methods developed for this setting on the challenging ant-maze locomotion task.
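To make the bi-level structure concrete, the sketch below shows one way such an objective can be organized: an inner loop that adapts only a high-level (option-sequencing) policy to a sampled task, and an outer loop that updates the shared option parameters so that the post-adaptation loss is low. This is a minimal, first-order illustration under assumed, simplified choices; it is not the authors' implementation. The names `options`, `high_level`, and `task_loss`, the linear sub-policy parameterization, and the synthetic regression-style loss are hypothetical stand-ins; a real implementation would use policy-gradient returns from rollouts in each task and would also meta-learn the termination conditions.

```python
# Minimal first-order sketch of a bi-level options objective (illustrative only).
# Outer level: shared option parameters (sub-policies).
# Inner level: a per-task high-level policy that sequences/mixes the options.
import torch

n_options, obs_dim, act_dim = 4, 8, 2

# Hypothetical parameterization: one linear sub-policy per option.
# Terminations would also be meta-learned in the full method; they are
# omitted from this toy loss for brevity.
options = torch.randn(n_options, obs_dim, act_dim, requires_grad=True)
meta_opt = torch.optim.Adam([options], lr=1e-3)

def task_loss(high_level, task_obs, task_target):
    """Surrogate loss for one task: the high-level policy mixes the options."""
    weights = torch.softmax(high_level, dim=0)              # (n_options,)
    per_option = task_obs @ options                          # (n_options, batch, act_dim)
    actions = torch.einsum("o,oba->ba", weights, per_option) # blend sub-policy outputs
    return ((actions - task_target) ** 2).mean()

for meta_step in range(100):
    meta_opt.zero_grad()
    for _ in range(8):                                       # batch of tasks
        # Synthetic task data; real tasks would be sampled environments.
        task_obs = torch.randn(32, obs_dim)
        task_target = torch.randn(32, act_dim)

        # Inner loop: only the high-level (option-sequencing) policy adapts.
        high_level = torch.zeros(n_options, requires_grad=True)
        for _ in range(5):
            loss = task_loss(high_level, task_obs, task_target)
            grad, = torch.autograd.grad(loss, high_level)
            high_level = (high_level - 0.1 * grad).detach().requires_grad_(True)

        # Outer loop: options are updated so the post-adaptation loss is low
        # (first-order approximation: gradients flow only through the final loss).
        task_loss(high_level, task_obs, task_target).backward()
    meta_opt.step()
```

The design point the sketch tries to convey is the split of responsibilities: the inner loop touches only the cheap-to-adapt high-level weights, while the outer update shapes the options themselves so that this adaptation converges quickly across the task family.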

