A robot operating in a household environment will see a wide range of unique and unfamiliar objects.
This formalism is implemented in three steps: assigning a consistent local coordinate frame to the task-relevant object parts, determining the location and orientation of this coordinate frame on unseen object instances, and executing an action that brings these frames into the desired alignment.
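The third step, bringing frames into a desired alignment, reduces to composing rigid transforms. A minimal sketch, assuming frames are represented as 4x4 homogeneous matrices and that the part frame has already been detected on the new instance (`align_frames` is a hypothetical helper name, not from the paper):

```python
import numpy as np

def align_frames(T_part, T_grasp_in_part):
    """Compose the detected part frame with a desired relative pose.

    T_part: 4x4 pose of the task-relevant part frame in the world.
    T_grasp_in_part: 4x4 desired gripper pose expressed in the part frame.
    Returns the gripper target pose in the world frame.
    """
    return T_part @ T_grasp_in_part

# Example: part frame translated 1 m along x; grasp 0.1 m above the part.
T_part = np.eye(4); T_part[0, 3] = 1.0
T_grasp = np.eye(4); T_grasp[2, 3] = 0.1
T_target = align_frames(T_part, T_grasp)
```

Because the desired pose is expressed relative to the part frame, the same grasp transfers to any unseen instance on which the frame can be localized.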
Decision-making is challenging in robotics environments with continuous object-centric states, continuous actions, long horizons, and sparse feedback.
In this work, we study policy-search-based methods for generalized planning, with a focus on the score function used to guide the search over policies.
Our key idea is to learn predicates by optimizing a surrogate objective that is tractable but faithful to our real efficient-planning objective.
no code implementations • 28 Oct 2021 • Nicholas Roy, Ingmar Posner, Tim Barfoot, Philippe Beaudoin, Yoshua Bengio, Jeannette Bohg, Oliver Brock, Isabelle Depatie, Dieter Fox, Dan Koditschek, Tomas Lozano-Perez, Vikash Mansinghka, Christopher Pal, Blake Richards, Dorsa Sadigh, Stefan Schaal, Gaurav Sukhatme, Denis Therien, Marc Toussaint, Michiel Van de Panne
Machine learning has long since become a keystone technology, accelerating science and applications in a broad range of domains.
In robotic domains, learning and planning are complicated by continuous state spaces, continuous action spaces, and long task horizons.
We then propose a bottom-up relational learning method for operator learning and show how the learned operators can be used for planning in a TAMP system.
The robot may later be called upon to retrieve objects and will need a long-term object-based memory in order to find them.
Adding auxiliary losses to the main objective function is a general way of encoding biases that can help networks learn better representations.
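The pattern itself is framework-agnostic. A minimal sketch in plain Python (the weights and loss values here are illustrative, not from any particular experiment):

```python
def total_loss(main_loss, aux_losses, weights):
    """Combine a main objective with weighted auxiliary losses.

    The auxiliary terms act as inductive biases: they shape the shared
    representation without changing the task the main loss defines.
    """
    return main_loss + sum(w * l for w, l in zip(weights, aux_losses))

# e.g. a reconstruction term and a self-supervised prediction term
loss = total_loss(main_loss=0.5, aux_losses=[0.2, 0.4], weights=[0.1, 0.05])
```

In practice the weights trade off how strongly each auxiliary signal is allowed to influence the shared representation.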
We conclude that learning to predict a sufficient set of objects for a planning problem is a simple, powerful, and general mechanism for planning in large instances.
A general meta-planning strategy is to learn to impose constraints on the states considered and actions taken by the agent.
We hypothesize that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent's life in order to expose it to experiences that enable it to obtain high rewards over the course of its lifetime.
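One common way to operationalize curiosity (an instantiation chosen for illustration, not necessarily the mechanism studied in the paper) is to reward the agent in proportion to its forward model's prediction error:

```python
import numpy as np

def curiosity_bonus(pred_next_state, true_next_state):
    """Intrinsic reward as forward-model prediction error: states the
    agent's model predicts poorly are 'surprising' and worth visiting."""
    return float(np.sum((pred_next_state - true_next_state) ** 2))

bonus_known = curiosity_bonus(np.array([1.0, 0.0]), np.array([1.0, 0.0]))
bonus_novel = curiosity_bonus(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

The bonus vanishes in well-modeled regions of the state space and is large in novel ones, so maximizing it drives exploration early in the agent's life.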
We address the problem of efficient exploration for transition model learning in the relational model-based reinforcement learning setting without extrinsic goals or rewards.
Such models, however, are approximate, which limits their applicability.
This paper introduces the Differentiable Algorithm Network (DAN), a composable architecture for robot learning systems.
We explore the use of graph neural networks (GNNs) to model spatial processes in which there is no a priori graphical structure.
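When no graph is given, one standard choice (shown here as an assumption, not necessarily the construction used in the paper) is to induce edges from spatial proximity, e.g. a k-nearest-neighbor graph over the points:

```python
import numpy as np

def knn_edges(points, k):
    """Induce a graph over spatial points that have no a priori structure:
    connect each point to its k nearest neighbors by Euclidean distance."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # exclude self-loops
    nbrs = np.argsort(d, axis=1)[:, :k]    # k closest per node
    return [(i, int(j)) for i in range(len(points)) for j in nbrs[i]]

pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
edges = knn_edges(pts, k=1)
```

A GNN then passes messages along these induced edges, so the model's structure tracks the spatial layout of each problem instance rather than a fixed graph.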
Modular meta-learning is a new framework for generalizing to unseen datasets by combining a small set of neural modules in different ways.
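The core idea, recombining a fixed set of modules under different structures, can be sketched with function composition (the modules and structures here are toy stand-ins for learned neural modules):

```python
def compose(modules, structure):
    """Modular meta-learning sketch: a 'structure' picks an ordering of
    reusable modules; generalization comes from recombining them."""
    def model(x):
        for name in structure:
            x = modules[name](x)
        return x
    return model

modules = {"double": lambda x: 2 * x, "inc": lambda x: x + 1}
f = compose(modules, ["double", "inc"])   # x -> 2x + 1
g = compose(modules, ["inc", "double"])   # x -> 2(x + 1)
```

At meta-test time only the discrete structure (and perhaps a light fine-tune of the modules) is searched over, which is far cheaper than learning a new model from scratch.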
In this paper, we propose a learning algorithm that speeds up the search in task and motion planning problems.
In many applications that involve processing high-dimensional data, it is important to identify a small set of entities that account for a significant fraction of detections.
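A simple greedy baseline for this selection problem (a sketch for intuition, not the paper's method) picks entities in order of frequency until the desired fraction of detections is covered:

```python
from collections import Counter

def head_entities(detections, fraction):
    """Smallest prefix of entities (greedy by frequency) whose detections
    account for at least the given fraction of all detections."""
    counts = Counter(detections)
    need = fraction * len(detections)
    chosen, covered = [], 0
    for entity, c in counts.most_common():
        if covered >= need:
            break
        chosen.append(entity)
        covered += c
    return chosen

dets = ["cup"] * 6 + ["plate"] * 3 + ["fork"] * 1
top = head_entities(dets, fraction=0.8)
```

For frequency counts this greedy order is optimal: taking the most frequent entities first always covers the target fraction with the fewest entities.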
For such complex planning problems, uniformly sampling actions without guidance until a path to the goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth.
In this paper we address planning problems in high-dimensional hybrid configuration spaces, with a particular focus on manipulation planning problems involving many objects.
The multiple instance problem arises in tasks where the training examples are ambiguous: a single example object may have many alternative feature vectors (instances) that describe it, and yet only one of those feature vectors may be responsible for the observed classification of the object.
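Under the standard multiple-instance assumption, a bag of instances is labeled positive if and only if at least one instance is positive. A minimal sketch (the scores and threshold are illustrative):

```python
def bag_label(instance_scores, threshold=0.5):
    """Standard multiple-instance assumption: a bag is positive iff at
    least one of its instances scores above the threshold."""
    return max(instance_scores) > threshold

positive_bag = bag_label([0.1, 0.9, 0.2])  # one strong instance suffices
negative_bag = bag_label([0.1, 0.3, 0.2])  # no instance is responsible
```

The `max` over instances captures the ambiguity: the learner never observes which instance triggered a positive bag label, only that some instance did.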