Meta Reinforcement Learning
65 papers with code • 3 benchmarks • 1 dataset
If the aim of these methods is to enable faster acquisition of entirely new behaviors, they must be evaluated on task distributions that are broad enough to support generalization to new behaviors.
In our approach, we perform online probabilistic filtering of latent task variables to infer how to solve a new task from small amounts of experience.
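The filtering idea above can be sketched with a minimal example: maintain a posterior over a discrete latent task variable and update it with Bayes' rule as each observation arrives. The task names, Gaussian reward likelihoods, and observations below are illustrative assumptions, not details from the paper.

```python
import math

def filter_latent_task(prior, likelihoods, observations):
    """Online Bayesian filtering over a discrete latent task variable.

    prior:        dict mapping task -> prior probability
    likelihoods:  dict mapping task -> function(obs) -> p(obs | task)
    observations: iterable of observations collected while acting
    Returns the posterior over tasks after all observations.
    """
    posterior = dict(prior)
    for obs in observations:
        # Bayes update: weight each task hypothesis by its likelihood,
        # then renormalize so the posterior sums to one.
        for task in posterior:
            posterior[task] *= likelihoods[task](obs)
        z = sum(posterior.values())
        posterior = {t: p / z for t, p in posterior.items()}
    return posterior

def gaussian_pdf(mu, sigma=1.0):
    # Density of a Gaussian with mean mu; models per-task reward noise.
    return lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (
        sigma * math.sqrt(2.0 * math.pi)
    )

# Two hypothetical tasks whose rewards differ only in mean.
prior = {"push": 0.5, "reach": 0.5}
likelihoods = {"push": gaussian_pdf(1.0), "reach": gaussian_pdf(-1.0)}

# A few observed rewards quickly concentrate the posterior on "push".
posterior = filter_latent_task(prior, likelihoods, [0.9, 1.2, 0.8])
```

With only three samples the posterior already favors the matching task, which is the sense in which small amounts of experience suffice to infer how to solve a new task.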
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.
In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation.
Despite significant progress, deep reinforcement learning (RL) suffers from data inefficiency and limited generalization.