no code implementations • 29 Sep 2021 • Parsa Mahmoudieh, Sayna Ebrahimi, Deepak Pathak, Trevor Darrell
Reward signals in reinforcement learning can be expensive signals in many tasks and often require access to direct state.
no code implementations • 25 Sep 2019 • Parsa Mahmoudieh, Trevor Darrell, Deepak Pathak
Instead of direct manual supervision which is tedious and prone to bias, in this work, our goal is to extract reusable skills from a collection of human demonstrations collected directly for several end-tasks.
1 code implementation • ICLR 2018 • Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A. Efros, Trevor Darrell
In our framework, the role of the expert is only to communicate the goals (i. e., what to imitate) during inference.
no code implementations • 21 Dec 2016 • Evan Shelhamer, Parsa Mahmoudieh, Max Argus, Trevor Darrell
Reinforcement learning optimizes policies for expected cumulative reward.