no code implementations • 29 Sep 2023 • Zhanpeng He, Matei Ciocarlie
We introduce MORPH, a method for co-optimization of hardware design parameters and control policies in simulation using reinforcement learning.
no code implementations • 12 Mar 2023 • Siddharth Singi, Zhanpeng He, Alvin Pan, Sandip Patel, Gunnar A. Sigurdsson, Robinson Piramuthu, Shuran Song, Matei Ciocarlie
In a Human-in-the-Loop paradigm, a robotic agent is able to act mostly autonomously in solving a task, but can request help from an external expert when needed.
no code implementations • 13 Sep 2021 • Zhenjia Xu, Zhanpeng He, Shuran Song
We introduce the Universal Manipulation Policy Network (UMPNet) -- a single image-based policy network that infers closed-loop action sequences for manipulating arbitrary articulated objects.
2 code implementations • 3 Nov 2020 • Zhenjia Xu, Zhanpeng He, Jiajun Wu, Shuran Song
3D scene representation for robot manipulation should capture three key object properties: permanency -- objects that become occluded over time continue to exist; amodal completeness -- objects have 3D occupancy, even if only partial observations are available; and spatiotemporal continuity -- the movement of each object is continuous over space and time.
no code implementations • 10 Mar 2020 • Bohan Wu, Feng Xu, Zhanpeng He, Abhi Gupta, Peter K. Allen
This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration.
8 code implementations • 24 Oct 2019 • Tianhe Yu, Deirdre Quillen, Zhanpeng He, Ryan Julian, Avnish Narayan, Hayden Shively, Adithya Bellathur, Karol Hausman, Chelsea Finn, Sergey Levine
Therefore, if the aim of these methods is to enable faster acquisition of entirely new behaviors, we must evaluate them on task distributions that are sufficiently broad to enable generalization to new behaviors.
Ranked #1 on Meta-Learning on ML10
1 code implementation • 4 Oct 2018 • Zhanpeng He, Ryan Julian, Eric Heiden, Hejia Zhang, Stefan Schaal, Joseph J. Lim, Gaurav Sukhatme, Karol Hausman
We complete unseen tasks by choosing new sequences of skill latents to control the robot using MPC, where our MPC model is composed of the pre-trained skill policy executed in the simulation environment, run in parallel with the real robot.
1 code implementation • 26 Sep 2018 • Ryan Julian, Eric Heiden, Zhanpeng He, Hejia Zhang, Stefan Schaal, Joseph J. Lim, Gaurav Sukhatme, Karol Hausman
In particular, we first use simulation to jointly learn a policy for a set of low-level skills, and a "skill embedding" parameterization which can be used to compose them.