Recognizing Object Affordances to Support Scene Reasoning for Manipulation Tasks
Affordance information about a scene provides important clues as to what actions may be executed in pursuit of a specified goal state. Integrating affordance-based reasoning into symbolic action planning pipelines would therefore enhance the flexibility of robot manipulation. Unfortunately, the top-performing affordance recognition methods rely on object category priors to boost the accuracy of affordance detection and segmentation, which limits generalization to unknown object categories. This paper describes an affordance recognition pipeline based on a category-agnostic region proposal network that proposes instance regions of an image across categories. To guide affordance learning in the absence of category priors, the training process includes the auxiliary task of explicitly inferring the affordances present within a proposal. In addition, a self-attention mechanism trained to interpret each proposal captures rich contextual dependencies within the region. Visual benchmarking shows that the trained network, called AffContext, reduces the performance gap between object-agnostic and object-informed affordance recognition. AffContext is linked to the Planning Domain Definition Language (PDDL) via an augmented state keeper for action planning across temporally spaced goal-oriented tasks. Manipulation experiments show that AffContext can successfully parse scene content to seed a symbolic planner's problem specification, whose execution completes the target task. Furthermore, task-oriented grasping for cutting and pounding actions demonstrates the exploitation of multiple affordances of a given object to complete specified tasks.
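To make the PDDL linkage concrete, the following is a minimal sketch of a planner problem specification that affordance detections could seed. The domain, object, and predicate names (manipulation, knife1, apple1, graspable, cuttable) are hypothetical placeholders for illustration, not the paper's actual encoding.

```pddl
;; Minimal sketch: a PDDL problem seeded from affordance detections.
;; All names here (manipulation, graspable, cuttable, knife1, apple1,
;; gripper1) are illustrative assumptions, not the paper's encoding.
(define (problem cut-apple-task)
  (:domain manipulation)
  (:objects knife1 apple1 - item
            gripper1 - gripper)
  (:init
    ;; Facts asserted from per-region affordance predictions:
    (graspable knife1)
    (cuttable apple1)
    ;; Robot state tracked by the state keeper:
    (hand-empty gripper1))
  (:goal (cut-done apple1)))
```

In such a setup, each detected affordance becomes an initial-state predicate, and the augmented state keeper would carry these facts forward between temporally spaced tasks so later goals can reuse earlier scene parses.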