Reactive Long Horizon Task Execution via Visual Skill and Precondition Models
Zero-shot execution of unseen robotic tasks is important to allowing robots to perform a wide variety of tasks in human environments, but collecting the amounts of data necessary to train end-to-end policies in the real-world is often infeasible. We describe an approach for sim-to-real training that can accomplish unseen robotic tasks using models learned in simulation to ground components of a simple task planner. We learn a library of parameterized skills, along with a set of predicates-based preconditions and termination conditions, entirely in simulation. We explore a block-stacking task because it has a clear structure, where multiple skills must be chained together, but our methods are applicable to a wide range of other problems and domains, and can transfer from simulation to the real-world with no fine tuning. The system is able to recognize failures and accomplish long-horizon tasks from perceptual input, which is critical for real-world execution. We evaluate our proposed approach in both simulation and in the real-world, showing an increase in success rate from 91.6% to 98% in simulation and from 10% to 80% success rate in the real-world as compared with naive baselines. For experiment videos including both real-world and simulation, see: https://www.youtube.com/playlist?list=PL-oD0xHUngeLfQmpngYkGFZarstfPOXqX
PDF Abstract