Beyond Pick-and-Place: Tackling Robotic Stacking of Diverse Shapes
We study the problem of robotic stacking with objects of complex geometry. We propose a challenging and diverse set of such objects that was carefully designed to require strategies beyond a simple "pick-and-place" solution. Our method is a reinforcement learning (RL) approach combined with vision-based interactive policy distillation and simulation-to-reality transfer. Our learned policies can efficiently handle multiple object combinations in the real world and exhibit a large variety of stacking skills. In a large experimental study, we investigate what choices matter for learning such general vision-based agents in simulation, and what affects optimal transfer to the real robot. We then leverage data collected by such policies and improve upon them with offline RL. A video and a blog post of our work are provided as supplementary material.
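The pipeline described above (an RL teacher with access to privileged simulator state, distilled into a vision-based student on states the student itself visits) can be caricatured in a few lines. Everything below is an illustrative assumption, not the paper's actual architecture: linear policies, Gaussian states, and additive noise standing in for the state-to-RGB observation gap.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear "teacher" acting on privileged simulator state.
W_teacher = rng.normal(size=(4, 2))

def teacher_action(states):
    return states @ W_teacher

# Interactive (DAgger-style) distillation sketch: each round, the states
# encountered under the current student are relabeled by the teacher, and
# the student is refit on its own degraded observations. (Here the visited
# states are drawn i.i.d. for simplicity rather than from a real rollout.)
W_student = np.zeros((4, 2))
for _ in range(5):
    states = rng.normal(size=(256, 4))              # states visited this round
    obs = states + 0.01 * rng.normal(size=states.shape)  # stand-in for vision
    labels = teacher_action(states)                 # teacher supervision
    W_student, *_ = np.linalg.lstsq(obs, labels, rcond=None)

# The student should closely match the teacher despite the observation gap.
err = float(np.abs(W_teacher - W_student).max())
print(f"max weight error after distillation: {err:.4f}")
```

The point of the interactive (rather than one-shot) labeling is that the student is supervised on the state distribution it actually induces, which is what the distillation step in the paper relies on.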
Dataset introduced in the paper: RGB-Stacking
Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Skill Generalization | RGB-Stacking | BC-IMP | Group 1 | 23 | #2
Skill Generalization | RGB-Stacking | BC-IMP | Group 2 | 39.3 | #1
Skill Generalization | RGB-Stacking | BC-IMP | Group 3 | 39.3 | #2
Skill Generalization | RGB-Stacking | BC-IMP | Group 4 | 77.5 | #1
Skill Generalization | RGB-Stacking | BC-IMP | Group 5 | 66 | #2
Skill Generalization | RGB-Stacking | BC-IMP | Average | 49 | #2
Skill Mastery | RGB-Stacking | BC-IMP | Group 1 | 75.6 | #1
Skill Mastery | RGB-Stacking | BC-IMP | Group 2 | 60.8 | #1
Skill Mastery | RGB-Stacking | BC-IMP | Group 3 | 70.8 | #2
Skill Mastery | RGB-Stacking | BC-IMP | Group 4 | 87.8 | #2
Skill Mastery | RGB-Stacking | BC-IMP | Group 5 | 78.3 | #2
Skill Mastery | RGB-Stacking | BC-IMP | Average | 74.6 | #2
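The "Average" rows can be sanity-checked as the unweighted mean of the five group scores. This equal weighting is an assumption about how the benchmark aggregates; the listed averages agree with it up to rounding.

```python
# Per-group metric values copied from the table above.
generalization = [23, 39.3, 39.3, 77.5, 66]
mastery = [75.6, 60.8, 70.8, 87.8, 78.3]

avg_gen = sum(generalization) / len(generalization)
avg_mas = sum(mastery) / len(mastery)

print(f"Skill Generalization average: {avg_gen:.2f}")
print(f"Skill Mastery average: {avg_mas:.2f}")
```

Both computed means land within a rounding step of the table's listed averages (49 and 74.6).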