We study the problem of robotic stacking with objects of complex geometry. We propose a challenging and diverse set of such objects that was carefully designed to require strategies beyond a simple "pick-and-place" solution. Our method is a reinforcement learning (RL) approach combined with vision-based interactive policy distillation and simulation-to-reality transfer. Our learned policies can efficiently handle multiple object combinations in the real world and exhibit a large variety of stacking skills. In a large experimental study, we investigate what choices matter for learning such general vision-based agents in simulation, and what affects optimal transfer to the real robot. We then leverage data collected by such policies and improve upon them with offline RL. A video and a blog post of our work are provided as supplementary material.
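The distillation step described above — training a vision-based student to imitate actions from a privileged state-based teacher — can be sketched minimally. Everything below is a hypothetical stand-in: the teacher, the "renderer" mapping state to observations, and the linear student fit by least-squares behavioral cloning are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical state-based teacher: a fixed linear expert policy
# (stand-in for an RL policy trained with privileged state access).
W_teacher = rng.normal(size=(8, 4))          # state_dim x action_dim
def teacher_policy(state):
    return state @ W_teacher                 # expert action from state

# The student only sees observations; here a fixed nonlinear map from
# state plays the role of image rendering (an illustrative assumption).
R = rng.normal(size=(8, 16))                 # state_dim x obs_dim
def render(state):
    return np.tanh(state @ R)

# Behavioral cloning: fit a linear student on (observation, teacher action)
# pairs by least squares -- a simplified, non-interactive distillation step.
states = rng.normal(size=(512, 8))
obs = render(states)                         # (512, 16) observations
actions = teacher_policy(states)             # (512, 4) action labels
W_student, *_ = np.linalg.lstsq(obs, actions, rcond=None)

# Evaluate imitation error of the distilled student.
mse = float(np.mean((obs @ W_student - actions) ** 2))
print(f"distillation MSE: {mse:.4f}")
```

A linear student cannot match the teacher exactly through the nonlinear observation map, but least squares guarantees it does no worse than predicting zero actions; the paper's interactive variant additionally collects new data under the student's own action distribution.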


Datasets

Introduced in the Paper: RGB-Stacking
Task: Skill Generalization · Dataset: RGB-Stacking · Model: BC-IMP

Metric     Value   Global Rank
Group 1    23      #2
Group 2    39.3    #1
Group 3    39.3    #2
Group 4    77.5    #1
Group 5    66      #2
Average    49      #2

Task: Skill Mastery · Dataset: RGB-Stacking · Model: BC-IMP

Metric     Value   Global Rank
Group 1    75.6    #1
Group 2    60.8    #1
Group 3    70.8    #2
Group 4    87.8    #2
Group 5    78.3    #2
Average    74.6    #2

Methods


No methods listed for this paper.