no code implementations • 26 Sep 2022 • Lunjun Zhang, Bradly C. Stadie
Intuitively, learning from such arbitrary demonstrations can be seen as a form of imitation learning (IL).
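A minimal sketch of this framing, assuming goal-conditioned behavioral cloning with hindsight relabeling (the architecture, the goal-extraction rule, and the squared-error loss are illustrative assumptions, not the paper's exact method):

```python
import torch
import torch.nn as nn

class GoalConditionedPolicy(nn.Module):
    def __init__(self, state_dim, goal_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

def hindsight_il_loss(policy, states, actions, achieved_goal):
    # Relabel: treat the goal actually reached at the end of the
    # trajectory as if it had been the intended one, so the
    # trajectory becomes an "expert" demonstration for that goal.
    goals = achieved_goal.unsqueeze(0).expand(states.shape[0], -1)
    pred = policy(states, goals)
    return ((pred - actions) ** 2).mean()  # behavioral cloning loss

# Toy usage: the goal is assumed to be the final (x, y) of the trajectory.
policy = GoalConditionedPolicy(state_dim=4, goal_dim=2, action_dim=2)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
states, actions = torch.randn(50, 4), torch.randn(50, 2)
loss = hindsight_il_loss(policy, states, actions, achieved_goal=states[-1, :2])
opt.zero_grad(); loss.backward(); opt.step()
```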
1 code implementation • 25 Nov 2020 • Lunjun Zhang, Ge Yang, Bradly C. Stadie
Planning, the ability to analyze a problem's large-scale structure and decompose it into interrelated subproblems, is a hallmark of human intelligence.
no code implementations • 23 Aug 2018 • Sören R. Künzel, Bradly C. Stadie, Nikita Vemuri, Varsha Ramakrishnan, Jasjeet S. Sekhon, Pieter Abbeel
We develop new algorithms for estimating heterogeneous treatment effects, combining recent developments in transfer learning for neural networks with insights from the causal inference literature.
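A hedged sketch of the general meta-learner recipe this line describes, shown as a simple neural T-learner (fit one outcome model per treatment arm, then difference the predictions); the paper's transfer-learning variants are more involved, and all names here are illustrative assumptions:

```python
import torch
import torch.nn as nn

def make_net(x_dim, hidden=64):
    return nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU(),
                         nn.Linear(hidden, 1))

def fit(net, X, y, steps=500, lr=1e-3):
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    for _ in range(steps):
        loss = ((net(X).squeeze(-1) - y) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

def t_learner_cate(X, y, w, x_query):
    # w is the binary treatment indicator; fit separate outcome
    # models mu1 (treated) and mu0 (control), then take the
    # difference of their predictions as the CATE estimate.
    mu0, mu1 = make_net(X.shape[1]), make_net(X.shape[1])
    fit(mu0, X[w == 0], y[w == 0])
    fit(mu1, X[w == 1], y[w == 1])
    with torch.no_grad():
        return (mu1(x_query) - mu0(x_query)).squeeze(-1)

# Toy usage with synthetic data (true effect: 2.0 + X[:, 1]):
n, d = 200, 5
X = torch.randn(n, d)
w = torch.randint(0, 2, (n,))
y = X[:, 0] + w * (2.0 + X[:, 1]) + 0.1 * torch.randn(n)
tau_hat = t_learner_cate(X, y, w, X[:5])
```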
7 code implementations • ICLR 2018 • Bradly C. Stadie, Ge Yang, Rein Houthooft, Xi Chen, Yan Duan, Yuhuai Wu, Pieter Abbeel, Ilya Sutskever
We consider the problem of exploration in meta reinforcement learning.
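For context, a minimal sketch of the meta-RL loop this sentence refers to: an inner gradient step of per-task adaptation and an outer update that differentiates through it, in the spirit of MAML. The quadratic toy loss below stands in for a policy-gradient surrogate built from sampled trajectories and is purely an assumption:

```python
import torch

class ToyTask:
    # Stand-in for an RL task: a quadratic "surrogate loss" with a
    # per-task optimum; real meta-RL would build this loss from
    # sampled exploration trajectories.
    def __init__(self, target):
        self.target = target
    def surrogate_loss(self, theta):
        return ((theta[0] - self.target) ** 2).sum()

def inner_adapt(theta, task, alpha=0.1):
    # One gradient step of per-task adaptation; create_graph=True
    # lets the outer update differentiate through this step, so
    # pre-adaptation (exploration) behavior can receive credit.
    loss = task.surrogate_loss(theta)
    grads = torch.autograd.grad(loss, theta, create_graph=True)
    return [p - alpha * g for p, g in zip(theta, grads)]

theta = [torch.zeros(2, requires_grad=True)]
meta_opt = torch.optim.Adam(theta, lr=0.01)
tasks = [ToyTask(torch.tensor([1.0, -1.0])), ToyTask(torch.tensor([2.0, 0.5]))]
for _ in range(200):
    meta_loss = sum(t.surrogate_loss(inner_adapt(theta, t)) for t in tasks)
    meta_opt.zero_grad(); meta_loss.backward(); meta_opt.step()
```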
3 code implementations • NeurIPS 2018 • Rein Houthooft, Richard Y. Chen, Phillip Isola, Bradly C. Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel
We propose a metalearning approach for learning gradient-based reinforcement learning (RL) algorithms.
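A hedged toy sketch of that idea: the loss the agent descends is itself parameterized, and an outer evolution-strategies loop adjusts those parameters so that inner-loop gradient descent yields high final return. The one-step "environment" and the tiny loss form below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def learned_loss(phi, action, reward):
    # Tiny parameterized loss standing in for a learned neural
    # loss function over trajectory statistics.
    return phi[0] * (action - phi[1] * reward) ** 2

def inner_train(phi, env_return, steps=50, lr=0.1):
    a = 0.0  # scalar "policy" parameter for a toy one-step problem
    for _ in range(steps):
        r = env_return(a)
        eps = 1e-4  # finite-difference gradient of the learned loss
        g = (learned_loss(phi, a + eps, r) -
             learned_loss(phi, a - eps, r)) / (2 * eps)
        a -= lr * g
    return env_return(a)  # final return achieved by the inner loop

def es_outer(env_return, iters=100, pop=20, sigma=0.1, lr=0.05):
    phi = np.array([1.0, 0.0])  # parameters of the loss function
    for _ in range(iters):
        noise = np.random.randn(pop, 2)
        rets = np.array([inner_train(phi + sigma * n, env_return)
                         for n in noise])
        phi += lr / (pop * sigma) * (noise.T @ (rets - rets.mean()))
    return phi

phi = es_outer(lambda a: -(a - 2.0) ** 2)  # toy env: best action is 2
```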
no code implementations • NeurIPS 2017 • Yan Duan, Marcin Andrychowicz, Bradly C. Stadie, Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, Wojciech Zaremba
A neural network is trained to take as input one demonstration and the current state (initially the first state of the other demonstration in the pair) and to output an action, with the goal that the resulting sequence of states and actions matches the second demonstration as closely as possible.
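A hedged sketch of this training setup; mean-pooling the conditioning demonstration is a simplification assumed here (the paper conditions on the full demonstration with a more expressive architecture):

```python
import torch
import torch.nn as nn

class OneShotImitator(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, demo_states, state):
        # Condition on a crude demo summary (mean-pooled states).
        ctx = demo_states.mean(dim=0, keepdim=True).expand(state.shape[0], -1)
        return self.net(torch.cat([ctx, state], dim=-1))

def training_step(model, opt, demo_a_states, demo_b_states, demo_b_actions):
    # Behavioral cloning: reproduce demo B's actions at demo B's
    # states, conditioned on demo A of the same task.
    pred = model(demo_a_states, demo_b_states)
    loss = ((pred - demo_b_actions) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```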
1 code implementation • 6 Mar 2017 • Bradly C. Stadie, Pieter Abbeel, Ilya Sutskever
A key difficulty in reinforcement learning is specifying a reward function for the agent to optimize.
1 code implementation • 3 Jul 2015 • Bradly C. Stadie, Sergey Levine, Pieter Abbeel
By parameterizing our learned model with a neural network, we are able to develop a scalable and efficient approach to exploration bonuses that can be applied to tasks with complex, high-dimensional state spaces.
Ranked #24 on the Atari 2600 Q*Bert task (Atari Games benchmark).
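A minimal sketch of the exploration-bonus scheme described in this entry: a neural forward-dynamics model is trained on observed transitions, and its prediction error on each transition is added to the reward as a novelty bonus (the error normalization and state encoding used in the paper are omitted here):

```python
import torch
import torch.nn as nn

class DynamicsModel(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

def model_update(model, opt, s, a, s_next):
    # The dynamics model itself is trained on observed transitions.
    loss = ((model(s, a) - s_next) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

def exploration_bonus(model, s, a, s_next, beta=1.0):
    # High prediction error marks a poorly modeled (novel) region
    # of the state space, so the agent is paid to visit it.
    with torch.no_grad():
        err = ((model(s, a) - s_next) ** 2).mean(dim=-1)
    return beta * err

def shaped_reward(r, bonus):
    return r + bonus
```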