no code implementations • NeurIPS 2018 • Bradly Stadie, Ge Yang, Rein Houthooft, Peter Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever
Results are presented on a new environment we call 'Krazy World': a difficult high-dimensional gridworld which is designed to highlight the importance of correctly differentiating through sampling distributions in meta-reinforcement learning.
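The core issue the entry above points to, that a naive meta-gradient drops the dependence of the sampling distribution on the policy parameters, can be illustrated in isolation. Below is a minimal, hypothetical sketch (a Bernoulli bandit in PyTorch, not the paper's code) of how a score-function surrogate restores the missing gradient term.

```python
# A minimal sketch, not the paper's code: a Bernoulli "policy" with success
# probability theta and reward equal to the sampled action. Differentiating the
# average sampled reward directly misses how the sampling distribution itself
# depends on theta; a score-function (REINFORCE) surrogate restores that term.
import torch

theta = torch.tensor(0.3, requires_grad=True)
dist = torch.distributions.Bernoulli(probs=theta)
actions = dist.sample((10_000,))   # samples are constants w.r.t. theta

# Naive estimate: actions.mean() has no gradient path back to theta at all.

# Surrogate: weight each (detached) reward by the log-probability of its sample,
# so d/dtheta of the surrogate is the score-function estimate of d/dtheta E[r].
surrogate = (actions.detach() * dist.log_prob(actions)).mean()
surrogate.backward()
print(theta.grad)   # approx 1.0, the true gradient of E[a] = theta
```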
7 code implementations • ICLR 2018 • Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever
We consider the problem of exploration in meta reinforcement learning.
3 code implementations • NeurIPS 2018 • Rein Houthooft, Richard Y. Chen, Phillip Isola, Bradly C. Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel
We propose a metalearning approach for learning gradient-based reinforcement learning (RL) algorithms.
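As a very loose, hypothetical analogue of the approach described above (not the paper's implementation), the sketch below meta-learns the single parameter of a toy "loss" with an evolution-strategies outer loop: each candidate loss trains an inner agent by gradient descent, and the loss parameter is nudged toward whatever makes the trained agent's true return highest.

```python
# A minimal, hypothetical sketch of an ES outer loop that meta-learns a loss.
# The "learned loss" here is a toy quadratic in the agent's parameter; in the
# paper it is a neural network over trajectory data.
import numpy as np

rng = np.random.default_rng(0)

def inner_train(loss_phi, steps=50, lr=0.1):
    """Train a 1-D 'agent' parameter by gradient descent on the learned loss
    L(theta) = (theta - loss_phi)**2, then report its true return."""
    theta = 0.0
    for _ in range(steps):
        grad = 2.0 * (theta - loss_phi)   # d/dtheta of the learned loss
        theta -= lr * grad
    return -(theta - 3.0) ** 2            # true objective, unknown to the loss

phi, sigma, alpha, pop = 0.0, 0.5, 0.05, 32   # ES hyperparameters
for it in range(200):
    eps = rng.normal(size=pop)
    returns = np.array([inner_train(phi + sigma * e) for e in eps])
    # ES gradient estimate on the outer objective (return after inner training)
    phi += alpha * (returns - returns.mean()) @ eps / (pop * sigma)

print(phi)   # drifts toward 3.0: the learned loss comes to encode the true goal
```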
10 code implementations • ICLR 2018 • Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
Combining parameter noise with traditional RL methods allows us to combine the best of both worlds.
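As a minimal illustration of the idea, perturbing the policy's parameters once per episode rather than adding noise to individual actions, here is a hypothetical NumPy sketch; the linear policy and toy rollout are placeholders, and the paper's adaptive noise scaling is omitted.

```python
# A minimal sketch, not the authors' implementation, of parameter-space noise:
# perturb the policy's weights once per rollout instead of per-step action noise.
import numpy as np

rng = np.random.default_rng(0)

class LinearPolicy:
    def __init__(self, obs_dim, act_dim):
        self.W = rng.normal(scale=0.1, size=(act_dim, obs_dim))

    def act(self, obs):
        return self.W @ obs

def perturbed_copy(policy, stddev):
    # Same noisy weights are used for the whole episode
    noisy = LinearPolicy(policy.W.shape[1], policy.W.shape[0])
    noisy.W = policy.W + rng.normal(scale=stddev, size=policy.W.shape)
    return noisy

def rollout(policy, horizon=100, obs_dim=4):
    # Placeholder environment: random observations, reward = -||action||^2
    total = 0.0
    for _ in range(horizon):
        obs = rng.normal(size=obs_dim)
        a = policy.act(obs)
        total += -float(a @ a)
    return total

base = LinearPolicy(obs_dim=4, act_dim=2)
explorer = perturbed_copy(base, stddev=0.05)
print(rollout(explorer))
```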
3 code implementations • NeurIPS 2017 • Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel
In this work, we describe a surprising finding: a simple generalization of the classic count-based approach can reach near state-of-the-art performance on various high-dimensional and/or continuous deep RL benchmarks.
Ranked #1 on Atari Games on Atari 2600 Freeway
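A minimal sketch of the count-based bonus described in the entry above, assuming a SimHash-style random-projection discretization (not the authors' code): states are hashed to a short binary code, visit counts are kept per code, and a bonus proportional to 1/sqrt(count) is added to the environment reward.

```python
# Hash-based count bonus (illustrative sketch, not the authors' code).
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)

class HashCounter:
    def __init__(self, obs_dim, n_bits=16, beta=0.01):
        self.A = rng.normal(size=(n_bits, obs_dim))  # random projection (SimHash)
        self.counts = defaultdict(int)
        self.beta = beta

    def _key(self, obs):
        # Sign pattern of the projection is the state's hash code
        return tuple((self.A @ obs > 0).astype(np.int8))

    def bonus(self, obs):
        k = self._key(obs)
        self.counts[k] += 1
        return self.beta / np.sqrt(self.counts[k])

counter = HashCounter(obs_dim=8)
obs = rng.normal(size=8)
extrinsic_reward = 0.0
shaped_reward = extrinsic_reward + counter.bonus(obs)  # added to the env reward
print(shaped_reward)
```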
38 code implementations • NeurIPS 2016 • Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel
This paper describes InfoGAN, an information-theoretic extension to the Generative Adversarial Network that is able to learn disentangled representations in a completely unsupervised manner.
Ranked #3 on Image Generation on Stanford Cars
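The disentanglement mechanism is an auxiliary mutual-information term. Below is a minimal sketch (toy layer sizes, not the paper's architecture): a recognition head Q tries to recover the categorical latent code from the generated sample, and the resulting cross-entropy is a variational lower bound on the mutual information that is added to the usual GAN objectives.

```python
# InfoGAN-style mutual-information term (illustrative sketch only).
import torch
import torch.nn as nn
import torch.nn.functional as F

z_dim, c_dim, x_dim = 16, 10, 64   # c is a 10-way categorical latent code

G = nn.Sequential(nn.Linear(z_dim + c_dim, 128), nn.ReLU(), nn.Linear(128, x_dim))
Q = nn.Sequential(nn.Linear(x_dim, 128), nn.ReLU(), nn.Linear(128, c_dim))  # logits of c

batch = 32
z = torch.randn(batch, z_dim)
c = torch.randint(0, c_dim, (batch,))
c_onehot = F.one_hot(c, c_dim).float()

x_fake = G(torch.cat([z, c_onehot], dim=1))
mi_loss = F.cross_entropy(Q(x_fake), c)  # -E[log Q(c | x)], a lower bound on -I(c; x) up to a constant

# In training, both G and Q are updated to minimize mi_loss (suitably scaled)
# alongside the usual generator/discriminator losses.
mi_loss.backward()
```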
2 code implementations • NeurIPS 2016 • Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel
While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios.
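The paper's exploration bonus is the information gain of a Bayesian dynamics model. As a hedged, closed-form stand-in for the paper's Bayesian neural network, the sketch below uses a one-dimensional Bayesian linear dynamics model, so the posterior update and the KL divergence (the intrinsic reward) can be computed exactly.

```python
# Information-gain bonus with a closed-form Bayesian linear dynamics model
# (illustrative sketch, not the paper's BNN implementation).
import numpy as np

def kl_gauss(m1, v1, m0, v0):
    """KL( N(m1, v1) || N(m0, v0) ) for scalar Gaussians."""
    return 0.5 * (np.log(v0 / v1) + (v1 + (m1 - m0) ** 2) / v0 - 1.0)

class BayesianDynamics:
    """Models s' = w * s + noise, with a Gaussian posterior over w."""
    def __init__(self, noise_var=0.1):
        self.mean, self.var = 0.0, 1.0   # prior over w
        self.noise_var = noise_var

    def update_and_bonus(self, s, s_next):
        prior_mean, prior_var = self.mean, self.var
        # Conjugate Gaussian update for the weight w
        precision = 1.0 / prior_var + s * s / self.noise_var
        self.var = 1.0 / precision
        self.mean = self.var * (prior_mean / prior_var + s * s_next / self.noise_var)
        # Intrinsic reward: how much this transition changed the model's beliefs
        return kl_gauss(self.mean, self.var, prior_mean, prior_var)

model = BayesianDynamics()
print(model.update_and_bonus(s=1.0, s_next=0.8))  # added to the environment reward
```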
15 code implementations • 22 Apr 2016 • Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning.
Ranked #1 on Continuous Control on Inverted Pendulum
no code implementations • 3 Aug 2015 • Rein Houthooft, Filip De Turck
To improve prediction accuracy, this paper proposes: (i) Joint inference and learning by integration of back-propagation and loss-augmented inference in SSVM subgradient descent; (ii) Extending SSVM factors to neural networks that form highly nonlinear functions of input features.
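As a minimal, hypothetical sketch of point (i), integrating back-propagation with loss-augmented inference in SSVM subgradient descent, the snippet below uses the simplest possible structure (a single multiclass label) with a small neural network providing the factor scores; it is not the paper's model.

```python
# Structured hinge loss with neural factor scores (illustrative sketch).
import torch
import torch.nn as nn

n_classes, feat_dim = 4, 8
score_net = nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, n_classes))
opt = torch.optim.SGD(score_net.parameters(), lr=0.1)

x = torch.randn(16, feat_dim)
y = torch.randint(0, n_classes, (16,))

for _ in range(100):
    scores = score_net(x)                                           # neural factor scores
    margin = torch.ones_like(scores).scatter_(1, y[:, None], 0.0)   # Hamming loss
    y_hat = (scores + margin).argmax(dim=1)                         # loss-augmented inference
    # SSVM subgradient: score of the most-violating label minus the gold score
    hinge = (scores.gather(1, y_hat[:, None]) + margin.gather(1, y_hat[:, None])
             - scores.gather(1, y[:, None])).clamp(min=0).mean()
    opt.zero_grad()
    hinge.backward()                                                # backprop through the factors
    opt.step()
```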