Guiding Policies with Language via Meta-Learning

ICLR 2019. John D. Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, Jacob Andreas, John DeNero, Pieter Abbeel, Sergey Levine

Behavioral skills or policies for autonomous agents are conventionally learned from reward functions, via reinforcement learning, or from demonstrations, via imitation learning. However, both modes of task specification have their disadvantages: reward functions require manual engineering, while demonstrations require a human expert to be able to actually perform the task in order to generate the demonstration...

