Framing inference as the inner-loop optimization of meta-learning leads to a model-based approach that is more data-efficient and capable of estimating the state of entities that we do not observe directly, but whose existence can be inferred from their effect on observed entities.
This paper introduces an approach for learning to solve continuous constraint satisfaction problems (CCSPs) in robotic reasoning and planning.
Self-supervised and language-supervised image models contain rich knowledge of the world that is important for generalization.
A robot deployed in a home over long stretches of time faces a true lifelong learning problem.
Task and Motion Planning (TAMP) approaches are effective at planning long-horizon autonomous robot manipulation.
We investigate whether LLMs can serve as generalized planners: given a domain and training tasks, generate a program that efficiently produces plans for other tasks in the domain.
Reasoning about the relationships between entities from input facts (e.g., whether Ari is a grandparent of Charlie) generally requires explicit consideration of other entities that are not mentioned in the query (e.g., the parents of Charlie).
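The point can be made concrete with a toy sketch (the facts and helper below are hypothetical, chosen only to mirror the example in the sentence): answering the query forces quantification over an intermediate entity y that the query itself never mentions.

```python
# Hypothetical toy knowledge base of parent facts.
parent = {("Ari", "Bob"), ("Bob", "Charlie")}
entities = {e for pair in parent for e in pair}

def is_grandparent(x, z):
    # exists y such that parent(x, y) and parent(y, z);
    # y is an entity absent from the query itself.
    return any((x, y) in parent and (y, z) in parent for y in entities)

print(is_grandparent("Ari", "Charlie"))  # True
print(is_grandparent("Bob", "Charlie"))  # False
```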
We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals.
Next, we analyze the learning properties of these neural networks, especially focusing on how they can be trained on a finite set of small graphs and generalize to larger graphs, which we term structural generalization.
This paper studies a model learning and online planning approach towards building flexible and general robots.
This formalism is implemented in three steps: assigning a consistent local coordinate frame to the task-relevant object parts, determining the location and orientation of this coordinate frame on unseen object instances, and executing an action that brings these frames into the desired alignment.
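The third step admits a simple geometric sketch (a minimal illustration, not the paper's implementation; the function name is hypothetical): given the part's local frame and the desired target frame as homogeneous transforms, the aligning action is the relative rigid transform between them.

```python
import numpy as np

def aligning_transform(T_part, T_target):
    """Rigid motion that carries the part frame onto the target frame.
    Both arguments are 4x4 homogeneous transforms in world coordinates,
    so aligning_transform(T_part, T_target) @ T_part == T_target."""
    return T_target @ np.linalg.inv(T_part)

# Example: the part frame at the origin, the target translated by (1, 2, 3).
T_part = np.eye(4)
T_target = np.eye(4)
T_target[:3, 3] = [1.0, 2.0, 3.0]
A = aligning_transform(T_part, T_target)
```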
An effective approach to solving long-horizon tasks in robotics domains with continuous state and action spaces is bilevel planning, wherein a high-level search over an abstraction of an environment is used to guide low-level decision-making.
Decision-making is challenging in robotics environments with continuous object-centric states, continuous actions, long horizons, and sparse feedback.
In this work, we study generalized policy search-based methods with a focus on the score function used to guide the search over policies.
Our key idea is to learn predicates by optimizing a surrogate objective that is tractable but faithful to our real efficient-planning objective.
The first is an algorithm for learning a rank function that guides the discrete task-level search, and the second is an algorithm for learning a sampler that guides the continuous motion-level search.
In this paper, we propose to leverage domain-independent heuristic functions commonly used in the classical planning literature to improve the sample efficiency of RL.
Our first contribution is a fine-grained analysis of the expressiveness of these neural networks, that is, the set of functions that they can realize and the set of problems that they can solve.
We present a framework for learning compositional, rational skill models (RatSkills) that support efficient planning and inverse planning for achieving novel goals and recognizing activities.
To leverage the sparsity in hypergraph neural networks, SpaLoc represents the grounding of relationships such as parent and grandparent as sparse tensors and uses neural networks and finite-domain quantification operations to infer new facts based on the input.
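The sparse-grounding idea can be sketched in a few lines (a simplified stand-in for SpaLoc's sparse tensors, using Python sets of index tuples; the variable names are illustrative): quantifying over the shared entity composes two relations while touching only nonzero entries.

```python
# Hypothetical sparse grounding: a relation over n entities is stored as a
# set of index pairs rather than a dense n x n Boolean tensor.
parent = {(0, 1), (1, 2), (1, 3)}

# Existential quantification over the intermediate entity j derives the
# grandparent relation by sparse composition.
grandparent = {(i, k) for (i, j) in parent for (j2, k) in parent if j == j2}

print(grandparent)  # {(0, 2), (0, 3)}
```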
We present a strategy for designing and building very general robot manipulation systems involving the integration of a general-purpose task-and-motion planner with engineered and learned perception modules that estimate properties and affordances of unknown objects.
We present Temporal and Object Quantification Networks (TOQ-Nets), a new class of neuro-symbolic networks with a structural bias that enables them to learn to recognize complex relational-temporal events.
In robotic domains, learning and planning are complicated by continuous state spaces, continuous action spaces, and long task horizons.
We then propose a bottom-up relational learning method for operator learning and show how the learned operators can be used for planning in a TAMP system.
We aim to learn generalizable representations for complex activities by quantifying over both entities and time, as in “the kicker is behind all the other players,” or “the player controls the ball until it moves toward the goal.” Such a structural inductive bias of object relations, object quantification, and temporal orders will enable the learned representation to generalize to situations with varying numbers of agents, objects, and time courses.
Program induction lies at the opposite end of the spectrum: programs are capable of extrapolating from very few examples, but we still do not know how to efficiently search for complex programs.
The problem of planning for a robot that operates in environments containing a large number of objects, taking actions to move itself through the world as well as to change the state of the objects, is known as task and motion planning (TAMP).
When an agent interacts with a complex environment, it receives a stream of percepts in which it may detect entities, such as objects or people.
Adding auxiliary losses to the main objective function is a general way of encoding biases that can help networks learn better representations.
We conclude that learning to predict a sufficient set of objects for a planning problem is a simple, powerful, and general mechanism for planning in large instances.
A general meta-planning strategy is to learn to impose constraints on the states considered and actions taken by the agent.
We use, and develop novel improvements on, state-of-the-art methods for active learning and sampling.
We hypothesize that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent's life in order to expose it to experiences that enable it to obtain high rewards over the course of its lifetime.
We address the problem of efficient exploration for transition model learning in the relational model-based reinforcement learning setting without extrinsic goals or rewards.
To solve multi-step manipulation tasks in the real world, an autonomous robot must take actions to observe its environment and react to unexpected observations.
This paper introduces the Differentiable Algorithm Network (DAN), a composable architecture for robot learning systems.
We explore the use of graph neural networks (GNNs) to model spatial processes in which there is no a priori graphical structure.
We propose an expressive class of policies, a strong but general prior, and a learning algorithm that, together, can learn interesting policies from very few examples.
Furthermore, as special cases of our general results, this article improves or complements several state-of-the-art theoretical results on deep neural networks, deep residual networks, and overparameterized deep neural networks with a unified proof technique and novel geometric insights.
At every local minimum of any deep neural network with these added neurons, the set of parameters of the original neural network (without added neurons) is guaranteed to be a global minimum of the original neural network.
In this paper, we analyze the effects of depth and width on the quality of local minima, without strong over-parameterization and simplification assumptions in the literature.
For any action, a rule selects a set of relevant objects and computes a distribution over properties of just those objects in the resulting state given their properties in the previous state.
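A minimal sketch of such an object-factored rule follows (the representation and helper are assumptions for illustration, not the paper's code): the rule reads only the relevant objects' previous properties, samples their next properties from a rule-specific distribution, and leaves all other objects untouched.

```python
import random

def apply_rule(state, relevant, transition):
    """state: dict mapping object -> property.
    relevant: the objects this rule reads and writes.
    transition: maps the tuple of relevant properties to a list of
    (next_properties, probability) outcomes."""
    key = tuple(state[o] for o in relevant)
    outcomes, probs = zip(*transition[key])
    next_props = random.choices(outcomes, weights=probs)[0]
    new_state = dict(state)
    new_state.update(dict(zip(relevant, next_props)))
    return new_state

# Example: a pick rule that only concerns object "a"; "b" is unaffected.
state = {"a": "on_table", "b": "clear"}
transition = {("on_table",): [(("held",), 1.0)]}
new_state = apply_rule(state, ["a"], transition)
```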
Multi-object manipulation problems in continuous state and action spaces can be solved by planners that search over sampled values for the continuous parameters of operators.
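The sampling-based search over continuous operator parameters can be sketched as simple rejection sampling (a generic illustration under assumed interfaces, not any particular planner's code): draw candidate values from a sampler until one satisfies the operator's constraint.

```python
import random

def sample_parameter(constraint, sampler, max_tries=1000):
    """Search over sampled values for a continuous operator parameter,
    returning the first sample that satisfies the constraint (or None)."""
    for _ in range(max_tries):
        theta = sampler()
        if constraint(theta):
            return theta
    return None

# Example: find a grasp offset in [0, 1] that clears a threshold.
random.seed(0)
theta = sample_parameter(lambda t: t > 0.5, lambda: random.uniform(0.0, 1.0))
```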
In this paper, we propose a learning algorithm that speeds up the search in task and motion planning problems.
We consider such a setting in which the agent can, while acting, transmit declarative information to the human that helps them understand aspects of this unseen environment.
Solving long-horizon problems in complex domains requires flexible generative planning that can combine primitive abilities in novel combinations to solve problems as they arise in the world.
In partially observed environments, it can be useful for a human to provide the robot with declarative information that represents probabilistic relational constraints on properties of objects in the world, augmenting the robot's sensory observations.
We extend PDDL to support a generic, declarative specification for these procedures that treats their implementation as black boxes.
This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions.
In this paper we address this challenge by constructing a representative subset of examples that is both small and is able to constrain the solver sufficiently.
Program synthesis is a class of regression problems where one seeks a solution, in the form of a source-code program, mapping the inputs to their corresponding outputs exactly.
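The "exactly" in this formulation is what distinguishes synthesis from ordinary regression, and a tiny enumerative sketch makes it concrete (the two-operator grammar and function names are hypothetical): search programs in order of length and accept only one that reproduces every example exactly.

```python
import itertools

# A hypothetical two-operator grammar over integer functions.
OPS = [("inc", lambda x: x + 1), ("dbl", lambda x: 2 * x)]

def synthesize(examples, max_len=3):
    """Return the shortest operator sequence mapping every input to its
    output exactly, or None if no program within max_len exists."""
    for n in range(1, max_len + 1):
        for prog in itertools.product(OPS, repeat=n):
            def run(x, prog=prog):
                for _name, f in prog:
                    x = f(x)
                return x
            if all(run(i) == o for i, o in examples):
                return [name for name, _f in prog]
    return None

print(synthesize([(1, 4), (2, 6)]))  # ['inc', 'dbl'], i.e. x -> 2 * (x + 1)
```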
For such complex planning problems, unguided uniform sampling of actions until a path to a goal is found is hopelessly inefficient, and gradient-based approaches often fall short when the optimization manifold of a given problem is not smooth.
This paper provides theoretical insights into why and how deep learning can generalize well, despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, responding to an open question in the literature.
We introduce STRIPStream: an extension of the STRIPS language which can model these domains by supporting the specification of black-box generators to handle complex constraints.
We introduce a framework for model learning and planning in stochastic domains with continuous state and action spaces and non-Gaussian transition models.
In this paper we address planning problems in high-dimensional hybrid configuration spaces, with a particular focus on manipulation planning problems involving many objects.
This paper presents a Bayesian optimization method with exponential convergence without the need of auxiliary optimization and without the delta-cover sampling.
We refer to this attribute-based representation as a world model, and consider how to acquire it via noisy perception and maintain it over time, as objects are added, changed, and removed in the world.