The resulting latent collocation method (LatCo) optimizes trajectories of latent states, which improves over previously proposed shooting methods for visual model-based RL on tasks with sparse rewards and long-term goals.
Meta-reinforcement learning algorithms can enable autonomous agents, such as robots, to quickly acquire new behaviors by leveraging prior experience in a set of related training tasks.
Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations.
The goal in this paper is to develop a method for continual online learning from an incoming stream of data, using deep neural network models.
Although reinforcement learning methods can achieve impressive results in simulation, the real world presents two major challenges: generating samples is exceedingly expensive, and unexpected perturbations or unseen situations cause proficient but specialized policies to fail at test time.
We present an approach for controlling a real-world legged millirobot that is based on learned neural network models.
Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance.