no code implementations • 27 Nov 2024 • Neel Jawale, Byron Boots, Balakumar Sundaralingam, Mohak Bhardwaj
We investigate the problem of teaching a robot manipulator to perform dynamic non-prehensile object transport, also known as the 'robot waiter' task, from a limited set of real-world demonstrations.
no code implementations • 26 Oct 2024 • Andrew Wagenmaker, Kevin Huang, Liyiming Ke, Byron Boots, Kevin Jamieson, Abhishek Gupta
To the best of our knowledge, this is the first evidence that simulation transfer yields a provable gain in reinforcement learning in settings where direct sim2real transfer fails.
1 code implementation • 13 Oct 2023 • Kevin Huang, Rwik Rana, Alexander Spitzer, Guanya Shi, Byron Boots
Precise arbitrary trajectory tracking for quadrotors is challenging due to unknown nonlinear dynamics, trajectory infeasibility, and actuation limits.
1 code implementation • 6 Oct 2023 • Jacob Sacks, Rwik Rana, Kevin Huang, Alex Spitzer, Guanya Shi, Byron Boots
A major challenge in robotics is to design robust policies which enable complex and agile behaviors in the real world.
no code implementations • ICCV 2023 • Amirreza Shaban, Joonho Lee, Sanghun Jung, Xiangyun Meng, Byron Boots
Existing self-training methods use a model trained on labeled source data to generate pseudo labels for target data and refine the predictions via fine-tuning the network on the pseudo labels.
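As a reference for the pipeline this abstract describes, here is a minimal sketch of a generic self-training loop, assuming a scikit-learn-style classifier with `fit` and `predict_proba`; the names, arrays, and confidence threshold are illustrative, not the paper's method.

```python
import numpy as np

def self_train(model, x_source, y_source, x_target, confidence=0.9, rounds=3):
    """Generic self-training loop: fit on labeled source data, pseudo-label
    confident target samples, then refine by fitting on the mixture."""
    model.fit(x_source, y_source)
    for _ in range(rounds):
        probs = model.predict_proba(x_target)       # (n, num_classes)
        pseudo = probs.argmax(axis=1)               # hard pseudo labels
        keep = probs.max(axis=1) >= confidence      # only confident ones
        if not keep.any():
            break
        # Refine on source plus confidently pseudo-labeled target data.
        x_mix = np.concatenate([x_source, x_target[keep]])
        y_mix = np.concatenate([y_source, pseudo[keep]])
        model.fit(x_mix, y_mix)
    return model
```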
no code implementations • 4 May 2023 • Boling Yang, Liyuan Zheng, Lillian J. Ratliff, Byron Boots, Joshua R. Smith
Autocurricular training is an important sub-area of multi-agent reinforcement learning (MARL) that allows multiple agents to learn emergent skills in an unsupervised co-evolving scheme.
no code implementations • 17 Apr 2023 • Yuxiang Yang, Xiangyun Meng, Wenhao Yu, Tingnan Zhang, Jie Tan, Byron Boots
Jumping is essential for legged robots to traverse difficult terrain.
1 code implementation • 30 Mar 2023 • Anqi Li, Byron Boots, Ching-An Cheng
We study a new paradigm for sequential decision making, called offline policy learning from observations (PLfO).
no code implementations • NeurIPS 2023 • Mohak Bhardwaj, Tengyang Xie, Byron Boots, Nan Jiang, Ching-An Cheng
We propose a novel model-based offline Reinforcement Learning (RL) framework, called Adversarial Model for Offline Reinforcement Learning (ARMOR), which can robustly learn policies to improve upon an arbitrary reference policy regardless of data coverage.
no code implementations • 5 Dec 2022 • Jacob Sacks, Byron Boots
This forces us to rely on a number of heuristics for generating samples and updating the distribution, which may lead to sub-optimal performance.
no code implementations • 5 Dec 2022 • Jacob Sacks, Byron Boots
We show that we can contend with this noise by learning how to update the control distribution more effectively and make better use of the few samples that we have.
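For context, here is a minimal sketch of the sampling-based control-distribution update (in the style of MPPI) whose sample efficiency this line of work targets; `cost_fn` and the hyperparameters are illustrative.

```python
import numpy as np

def mppi_update(mean, cost_fn, n_samples=8, sigma=0.3, temperature=1.0, rng=None):
    """One MPPI-style update of a Gaussian control distribution.

    mean: (horizon, action_dim) current mean control sequence
    cost_fn: maps a control sequence to a scalar trajectory cost
    With only a few samples the softmax weights below are very noisy,
    which is exactly the issue a learned update rule can address.
    """
    rng = rng or np.random.default_rng(0)
    noise = sigma * rng.standard_normal((n_samples, *mean.shape))
    costs = np.array([cost_fn(mean + eps) for eps in noise])
    w = np.exp(-(costs - costs.min()) / temperature)  # exponentiated costs
    w /= w.sum()                                      # normalized weights
    return mean + np.tensordot(w, noise, axes=1)      # weighted noise average
```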
1 code implementation • 21 Oct 2022 • Adam Fishman, Adithyavairan Murali, Clemens Eppner, Bryan Peele, Byron Boots, Dieter Fox
Collision-free motion generation in unknown environments is a core building block for robot manipulation.
1 code implementation • 17 Oct 2022 • Carolina Higuera, Siyuan Dong, Byron Boots, Mustafa Mukadam
In experiments, we find that Neural Contact Fields are able to localize multiple contact patches without making any assumptions about the geometry of the contact, and capture contact/no-contact transitions for known categories of objects with unseen shapes in unseen environment configurations.
no code implementations • 27 Jun 2022 • Yuxiang Yang, Xiangyun Meng, Wenhao Yu, Tingnan Zhang, Jie Tan, Byron Boots
Using only 40 minutes of human demonstration data, our framework learns to adjust the speed and gait of the robot based on perceived terrain semantics, and enables the robot to walk over 6km without failure at close-to-optimal speed.
no code implementations • ICCV 2023 • Sanghun Jung, Jungsoo Lee, Nanhee Kim, Amirreza Shaban, Byron Boots, Jaegul Choo
That is, a model does not have a chance to learn test data in a class-discriminative manner, which was feasible in other adaptation tasks (e.g., unsupervised domain adaptation) via supervised losses on the source data.
no code implementations • 11 Apr 2022 • Julen Urain, An T. Le, Alexander Lambert, Georgia Chalvatzaki, Byron Boots, Jan Peters
In this paper, we focus on the problem of integrating Energy-based Models (EBM) as guiding priors for motion optimization.
no code implementations • 14 Feb 2022 • Boling Yang, Golnaz Habibi, Patrick E. Lancaster, Byron Boots, Joshua R. Smith
This project aims to motivate research in competitive human-robot interaction by creating a robot competitor that can challenge human users in certain scenarios such as physical exercise and games.
Multi-agent Reinforcement Learning • Reinforcement Learning (RL)
no code implementations • 15 Nov 2021 • Hamid Izadinia, Byron Boots, Steven M. Seitz
Nonprehensile manipulation involves long horizon underactuated object interactions and physical contact with different objects that can inherently introduce a high degree of uncertainty.
no code implementations • 10 Oct 2021 • Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa
If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly.
1 code implementation • 16 Jun 2021 • Nolan Wagener, Byron Boots, Ching-An Cheng
We propose a new algorithm, SAILR, that uses an intervention mechanism based on advantage functions to keep the agent safe throughout training and optimizes the agent's policy using off-the-shelf RL algorithms designed for unconstrained MDPs.
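A hypothetical sketch of an advantage-based intervention rule of the kind described: if a safety critic judges the agent's action meaningfully riskier than a backup policy's baseline, the backup takes over. The function names and threshold `eta` are assumptions, not the paper's exact formulation.

```python
def safe_step(state, agent_action, Q_safe, V_safe, backup_policy, eta=0.0):
    """Advantage-based intervention in the spirit described above.

    Q_safe/V_safe score expected future safety cost; if the agent's
    action looks riskier than the backup policy's baseline by more
    than eta, the intervention triggers and the backup policy acts.
    """
    advantage = Q_safe(state, agent_action) - V_safe(state)
    if advantage > eta:                      # action is too risky
        return backup_policy(state), True    # intervene
    return agent_action, False               # let the agent act
```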
no code implementations • 7 May 2021 • Mandy Xie, Anqi Li, Karl Van Wyk, Frank Dellaert, Byron Boots, Nathan Ratliff
Many IL methods, such as Dataset Aggregation (DAgger), combat challenges like distributional shift by interacting with oracular experts.
1 code implementation • 9 Apr 2021 • Yuxiang Yang, Tingnan Zhang, Erwin Coumans, Jie Tan, Byron Boots
We focus on the problem of developing energy efficient controllers for quadrupedal robots.
2 code implementations • 7 Apr 2021 • Nathan Hatch, Byron Boots
We show that this value function can be used by MPC directly, resulting in more efficient and resilient behavior at runtime.
no code implementations • 25 Mar 2021 • Amirreza Shaban, Amir Rahimi, Thalaiyasingam Ajanthan, Byron Boots, Richard Hartley
When the novel objects are localized, we utilize them to learn a linear appearance model to detect novel classes in new images.
no code implementations • 10 Mar 2021 • Anqi Li, Ching-An Cheng, M. Asif Rana, Man Xie, Karl Van Wyk, Nathan Ratliff, Byron Boots
Using RMPflow as a structured policy class in learning has several benefits, such as sufficient expressiveness, the flexibility to inject different levels of prior knowledge as well as the ability to transfer policies between robots.
no code implementations • 7 Jan 2021 • Joris Guerin, Stephane Thiery, Eric Nyiri, Olivier Gibaru, Byron Boots
First, extensive experiments show that, for a given dataset, the choice of CNN architecture for feature extraction has a major impact on the final clustering.
no code implementations • 24 Dec 2020 • M. Asif Rana, Anqi Li, Dieter Fox, Sonia Chernova, Byron Boots, Nathan Ratliff
The policy structure provides the user an interface to 1) specify the spaces that are directly relevant to the completion of the tasks, and 2) design policies for certain tasks that do not need to be learned.
no code implementations • ICLR 2021 • Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots
We further propose an algorithm that changes $\lambda$ over time to reduce the dependence on MPC as our estimates of the value function improve, and test the efficacy of our approach on challenging high-dimensional manipulation tasks with biased models in simulation.
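One plausible way to realize such a $\lambda$-blend is a TD($\lambda$)-style recursion that mixes model rollout rewards with a learned terminal value; the sketch below is an assumption about the general mechanism, not the paper's exact estimator.

```python
def blended_return(rewards, values, lam, gamma=0.99):
    """Blend an H-step model rollout with a learned value function.

    rewards: (H,) rewards from an MPC model rollout
    values:  (H,) learned estimates V(s_{k+1}) after each rollout step
    lam near 1 trusts the model rollout; lam near 0 trusts the learned
    value, so decaying lam over training shifts reliance from MPC to
    the value function.
    """
    est = values[-1]                          # pure bootstrap at the horizon
    for r, v in zip(reversed(rewards), reversed(values)):
        est = r + gamma * ((1 - lam) * v + lam * est)
    return est
```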
no code implementations • 15 Nov 2020 • Alexander Lambert, Adam Fishman, Dieter Fox, Byron Boots, Fabio Ramos
By casting MPC as a Bayesian inference problem, we employ variational methods for posterior computation, naturally encoding the complexity and multi-modality of the decision making problem.
no code implementations • 13 Nov 2020 • Liyiming Ke, Jingqiang Wang, Tapomayukh Bhattacharjee, Byron Boots, Siddhartha Srinivasa
Billions of people use chopsticks, a simple yet versatile tool, for fine manipulation of everyday objects.
no code implementations • 20 Oct 2020 • Siddarth Srinivasan, Sandesh Adhikary, Jacob Miller, Guillaume Rabusseau, Byron Boots
We address this gap by showing how stationary or uniform versions of popular quantum tensor network models have equivalent representations in the stochastic processes and weighted automata literature, in the limit of infinitely long sequences.
no code implementations • 21 Sep 2020 • Xingye Da, Zhaoming Xie, David Hoeller, Byron Boots, Animashree Anandkumar, Yuke Zhu, Buck Babich, Animesh Garg
We present a hierarchical framework that combines model-based control and reinforcement learning (RL) to synthesize robust controllers for a quadruped (the Unitree Laikago).
no code implementations • 6 Jul 2020 • Xinyan Yan, Byron Boots, Ching-An Cheng
Here policies are optimized by performing online learning on a sequence of loss functions that encourage the learner to mimic expert actions, and if the online learning has no regret, the agent can provably learn an expert-like policy.
2 code implementations • L4DC 2020 • Muhammad Asif Rana, Anqi Li, Dieter Fox, Byron Boots, Fabio Ramos, Nathan Ratliff
The complex motions are encoded as rollouts of a stable dynamical system, which, under a change of coordinates defined by a diffeomorphism, is equivalent to a simple, hand-specified dynamical system.
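The mechanism can be made concrete with a short sketch: if `phi` is the diffeomorphism and the latent system `f_simple` is stable, the chain rule recovers the velocity of the complex system in the original coordinates. All names are illustrative.

```python
import numpy as np

def pullback_dynamics(x, phi, jac_phi, f_simple):
    """Velocity of the complex system obtained by pulling a simple,
    stable system back through a diffeomorphism phi.

    In latent coordinates y = phi(x) the motion follows y_dot = f_simple(y);
    the chain rule y_dot = J_phi(x) x_dot gives the real-space velocity,
    and stability is preserved under the change of coordinates.
    """
    y = phi(x)
    y_dot = f_simple(y)                       # e.g., f_simple(y) = -y is stable
    x_dot = np.linalg.solve(jac_phi(x), y_dot)
    return x_dot
```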
1 code implementation • ECCV 2020 • Amir Rahimi, Amirreza Shaban, Thalaiyasingam Ajanthan, Richard Hartley, Byron Boots
Weakly Supervised Object Localization (WSOL) methods only require image level labels as opposed to expensive bounding box annotations required by fully supervised algorithms.
1 code implementation • NeurIPS 2020 • Amir Rahimi, Amirreza Shaban, Ching-An Cheng, Richard Hartley, Byron Boots
A common approach is to learn a post-hoc calibration function that transforms the output of the original network into calibrated confidence scores while maintaining the network's accuracy.
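The best-known instance of such a post-hoc calibration function is temperature scaling, sketched below: rescaling logits by a scalar T preserves the argmax (and hence accuracy) while adjusting confidence. This illustrates the common approach the abstract refers to, not the paper's own method.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 46)):
    """Temperature scaling: divide logits by a scalar T > 0, which keeps
    the predicted class fixed, then pick the T minimizing validation NLL."""
    best_T, best_nll = 1.0, np.inf
    for T in grid:
        p = softmax(logits / T)
        nll = -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
        if nll < best_nll:
            best_T, best_nll = T, nll
    return best_T
```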
no code implementations • 31 Dec 2019 • Mohak Bhardwaj, Ankur Handa, Dieter Fox, Byron Boots
Model-free Reinforcement Learning (RL) works well when experience can be collected cheaply and model-based RL is effective when system dynamics can be modeled accurately.
no code implementations • 3 Dec 2019 • Jonathan Lee, Ching-An Cheng, Ken Goldberg, Byron Boots
We prove that there is a fundamental equivalence between achieving sublinear dynamic regret in COL and solving certain EPs, and we present a reduction from dynamic regret to both static regret and convergence rate of the associated EP.
no code implementations • 2 Dec 2019 • Sandesh Adhikary, Siddarth Srinivasan, Geoff Gordon, Byron Boots
Extending classical probabilistic reasoning using the quantum mechanical view of probability has been of recent interest, particularly in the development of hidden quantum Markov models (HQMMs) to model stochastic processes.
no code implementations • 14 Nov 2019 • Ching-An Cheng, Remi Tachet des Combes, Byron Boots, Geoff Gordon
We present a reduction from reinforcement learning (RL) to no-regret online learning based on the saddle-point formulation of RL, by which "any" online algorithm with sublinear regret can generate policies with provable performance guarantees.
no code implementations • 13 Nov 2019 • Ajay Mandlekar, Fabio Ramos, Byron Boots, Silvio Savarese, Li Fei-Fei, Animesh Garg, Dieter Fox
For simple short-horizon manipulation tasks with modest variation in task instances, offline learning from a small set of demonstrations can produce controllers that successfully solve the task.
no code implementations • 7 Oct 2019 • Mustafa Mukadam, Ching-An Cheng, Dieter Fox, Byron Boots, Nathan Ratliff
RMPfusion supplements RMPflow with weight functions that can hierarchically reshape the Lyapunov functions of the subtask RMPs according to the current configuration of the robot and environment.
no code implementations • 8 Aug 2019 • Ching-An Cheng, Xinyan Yan, Byron Boots
This can be attributed, at least in part, to the high variance in estimating the gradient of the task objective with Monte Carlo methods.
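For reference, here is a minimal Monte Carlo policy gradient estimator with a baseline control variate; subtracting the baseline leaves the estimate unbiased (the score function has zero mean) but can reduce exactly the variance the abstract points to. Shapes and names are illustrative.

```python
import numpy as np

def reinforce_gradient(grad_logp, returns, baseline=None):
    """Monte Carlo policy gradient with an optional control variate.

    grad_logp: (N, d) per-trajectory score vectors grad log pi(tau)
    returns:   (N,)   per-trajectory returns
    """
    b = returns.mean() if baseline is None else baseline
    return (grad_logp * (returns - b)[:, None]).mean(axis=0)
```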
no code implementations • 16 Jul 2019 • Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots, Siddhartha Srinivasa
If new search problems are sufficiently similar to problems solved during training, the learned policy will choose a good edge evaluation ordering and solve the motion planning problem quickly.
1 code implementation • 27 May 2019 • Wen Sun, Anirudh Vemula, Byron Boots, J. Andrew Bagnell
We design a new model-free algorithm for ILFO, Forward Adversarial Imitation Learning (FAIL), which learns a sequence of time-dependent policies by minimizing an Integral Probability Metric between the observation distributions of the expert policy and the learner.
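An Integral Probability Metric compares two distributions by the largest mean discrepancy over a function class; a canonical, computable instance is the squared MMD over an RKHS ball, sketched here on expert and learner observation samples. This illustrates the objective family, not FAIL's specific implementation.

```python
import numpy as np

def mmd2(X, Y, sigma=1.0):
    """Squared maximum mean discrepancy with a Gaussian kernel between
    two sample sets (rows of X and Y), e.g. expert vs. learner
    observations at a given time step."""
    def k(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()
```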
no code implementations • ICLR 2020 • Ahmed H. Qureshi, Jacob J. Johnson, Yuzhe Qin, Taylor Henderson, Byron Boots, Michael C. Yip
The composition of elementary behaviors to solve challenging transfer learning problems is one of the key elements in building intelligent machines.
1 code implementation • ICCV 2019 • Amirreza Shaban, Amir Rahimi, Shray Bansal, Stephen Gould, Byron Boots, Richard Hartley
We model the selection as an energy minimization problem with unary and pairwise potential functions.
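A minimal sketch of such an energy: unary potentials score each candidate per image, pairwise potentials score cross-image compatibility, and selection minimizes their sum (brute force here purely for clarity; realistic solvers use relaxations or message passing). All data structures are illustrative.

```python
import itertools

def energy(select, unary, pairwise):
    """Energy of a candidate selection, one index chosen per image.

    select:   (B,) chosen candidate index for each image
    unary:    list of per-candidate score arrays, one per image
    pairwise: dict mapping (b1, b2) -> (n_b1, n_b2) compatibility matrix
    """
    e = sum(unary[b][s] for b, s in enumerate(select))
    e += sum(pairwise[b1, b2][select[b1], select[b2]]
             for (b1, b2) in pairwise)
    return e

def brute_force_min(unary, pairwise):
    """Exhaustive minimization over all joint selections."""
    sizes = [len(u) for u in unary]
    return min(itertools.product(*map(range, sizes)),
               key=lambda s: energy(s, unary, pairwise))
```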
no code implementations • 9 Mar 2019 • Sandesh Adhikary, Siddarth Srinivasan, Byron Boots
Quantum graphical models (QGMs) extend the classical framework for reasoning about uncertainty by incorporating the quantum mechanical view of probability.
no code implementations • 24 Feb 2019 • Nolan Wagener, Ching-An Cheng, Jacob Sacks, Byron Boots
In this paper, we show that there exists a close connection between MPC and online learning, an abstract theoretical framework for analyzing online decision making in the optimization literature.
no code implementations • 19 Feb 2019 • Ching-An Cheng, Jonathan Lee, Ken Goldberg, Byron Boots
Furthermore, we show for COL a reduction from dynamic regret to both static regret and convergence in the associated EP, allowing us to analyze the dynamic regret of many existing algorithms.
1 code implementation • 14 Feb 2019 • Anqi Li, Mustafa Mukadam, Magnus Egerstedt, Byron Boots
We propose a collection of RMPs for simple multi-robot tasks that can be used for building controllers for more complicated tasks.
Robotics
1 code implementation • 16 Nov 2018 • Ching-An Cheng, Mustafa Mukadam, Jan Issac, Stan Birchfield, Dieter Fox, Byron Boots, Nathan Ratliff
We develop a novel policy synthesis algorithm, RMPflow, based on geometrically consistent transformations of Riemannian Motion Policies (RMPs).
Robotics • Systems and Control
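A simplified sketch of the central pullback-and-resolve computation, assuming each subtask RMP supplies a desired acceleration together with a Riemannian metric; the curvature terms of the full geometrically consistent transform are omitted here.

```python
import numpy as np

def resolve_rmps(jacobians, accels, metrics):
    """Combine subtask RMPs into one configuration-space policy.

    Each subtask i lives in its own task space and supplies a desired
    acceleration a_i with metric M_i; J_i maps joint velocities into
    that space. Curvature terms are dropped for brevity.
    """
    d = jacobians[0].shape[1]
    f = np.zeros(d)
    M = np.zeros((d, d))
    for J, a, Mi in zip(jacobians, accels, metrics):
        f += J.T @ (Mi @ a)         # pull back the force
        M += J.T @ Mi @ J           # pull back the metric
    return np.linalg.pinv(M) @ f    # metric-weighted resolve
```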
2 code implementations • NeurIPS 2018 • Brandon Amos, Ivan Dario Jimenez Rodriguez, Jacob Sacks, Byron Boots, J. Zico Kolter
We present foundations for using Model Predictive Control (MPC) as a differentiable policy class for reinforcement learning in continuous state and action spaces.
no code implementations • NeurIPS 2018 • Siddarth Srinivasan, Carlton Downey, Byron Boots
Unlike classical graphical models, QGMs represent uncertainty with density matrices in complex Hilbert spaces.
2 code implementations • 25 Oct 2018 • Amirreza Shaban, Ching-An Cheng, Nathan Hatch, Byron Boots
Bilevel optimization has been recently revisited for designing and analyzing algorithms in hyperparameter tuning and meta learning tasks.
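The setting can be illustrated on a toy bilevel problem: unroll the inner optimization forward, then back-propagate the validation loss through only the last K steps. The ridge-regression inner problem and all names below are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

def truncated_hypergrad(lam, A, b, A_val, b_val, T=200, K=10, lr=0.01):
    """Truncated back-propagation through an unrolled inner optimization.

    Inner problem: x minimizes 0.5*||Ax-b||^2 + 0.5*lam*||x||^2 via T
    gradient steps. The hypergradient of the validation loss w.r.t. lam
    is back-propagated through only the last K steps, trading exactness
    for memory and compute.
    """
    n = A.shape[1]
    H = A.T @ A + lam * np.eye(n)             # inner-loss Hessian
    xs = [np.zeros(n)]
    for _ in range(T):                        # forward unroll
        x = xs[-1]
        xs.append(x - lr * (H @ x - A.T @ b))
    # Reverse pass over the last K steps only.
    p = A_val.T @ (A_val @ xs[-1] - b_val)    # d(val loss)/d(x_T)
    g = 0.0
    for t in range(T - 1, T - 1 - K, -1):
        g += p @ (-lr * xs[t])                # d(x_{t+1})/d(lam) = -lr * x_t
        p = p - lr * (H @ p)                  # pull p back one step
    return g
```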
1 code implementation • 15 Oct 2018 • Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots
We present a predictor-corrector framework, called PicCoLO, that can transform a first-order model-free reinforcement or imitation learning algorithm into a new hybrid method that leverages predictive models to accelerate policy learning.
1 code implementation • NeurIPS 2018 • Hugh Salimbeni, Ching-An Cheng, Byron Boots, Marc Deisenroth
It adopts an orthogonal basis in the mean function to model the residues that cannot be learned by the standard coupled approach.
no code implementations • ICLR 2019 • Ahmed H. Qureshi, Byron Boots, Michael C. Yip
We consider a problem of learning the reward and policy from expert examples under unknown dynamics.
no code implementations • 4 Aug 2018 • Jing Dong, Byron Boots, Frank Dellaert, Ranveer Chandra, Sudipta N. Sinha
Such descriptors are often derived using supervised learning on existing datasets with ground truth correspondences.
1 code implementation • 26 Jul 2018 • Joris Guérin, Olivier Gibaru, Eric Nyiri, Stéphane Thiery, Byron Boots
Although deep learning has facilitated progress in image understanding, a robot's performance in problems like object recognition often depends on the angle from which the object is observed.
1 code implementation • 20 Jul 2018 • Joris Guérin, Byron Boots
For many image clustering problems, replacing raw image data with features extracted by a pretrained convolutional neural network (CNN) leads to better clustering performance.
no code implementations • 12 Jun 2018 • Ching-An Cheng, Xinyan Yan, Evangelos A. Theodorou, Byron Boots
When the model oracle is learned online, these algorithms can provably accelerate the best known convergence rate up to an order.
no code implementations • ICLR 2018 • Wen Sun, J. Andrew Bagnell, Byron Boots
In this paper, we propose to combine imitation and reinforcement learning via the idea of reward shaping using an oracle.
no code implementations • NeurIPS 2018 • Wen Sun, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell
Recently, a novel class of Approximate Policy Iteration (API) algorithms has demonstrated impressive practical performance (e.g., ExIt from [2], AlphaGo-Zero from [27]).
no code implementations • 26 May 2018 • Ching-An Cheng, Xinyan Yan, Nolan Wagener, Byron Boots
We show that if the switching time is properly randomized, LOKI can learn to outperform a suboptimal expert and converge faster than running policy gradient from scratch.
no code implementations • 22 Jan 2018 • Ching-An Cheng, Byron Boots
Value aggregation is a general framework for solving imitation learning problems.
no code implementations • ICLR 2018 • Krzysztof Choromanski, Carlton Downey, Byron Boots
In this paper, we extend the theory of ORFs to Kernel Ridge Regression and show that ORFs can be used to obtain Orthogonal PSRNNs (OPSRNNs), which are smaller and faster than PSRNNs.
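For reference, a standard construction of Orthogonal Random Features for the Gaussian kernel: orthogonalize a Gaussian projection block by block and resample the row norms from a chi distribution. This sketches the ORF ingredient the abstract builds on, under these assumptions.

```python
import numpy as np

def orthogonal_random_features(X, n_features, sigma=1.0, rng=None):
    """Orthogonal random Fourier features for the Gaussian kernel.

    Orthogonal directions (QR of a Gaussian block) with chi-distributed
    norms lower the variance of the kernel estimate relative to i.i.d.
    random features.
    """
    rng = rng or np.random.default_rng(0)
    d = X.shape[1]
    blocks = []
    for _ in range(int(np.ceil(n_features / d))):
        G = rng.standard_normal((d, d))
        Q, _ = np.linalg.qr(G)                     # orthogonal directions
        norms = np.sqrt(rng.chisquare(d, size=d))  # restore Gaussian norms
        blocks.append(norms[:, None] * Q)
    W = np.vstack(blocks)[:n_features] / sigma
    proj = X @ W.T
    return np.hstack([np.cos(proj), np.sin(proj)]) / np.sqrt(n_features)
```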
no code implementations • NeurIPS 2017 • Ching-An Cheng, Byron Boots
Furthermore, it yields a variational inference problem that can be solved by stochastic gradient ascent with time and space complexity that is only linear in the number of mean function parameters, regardless of the choice of kernels, likelihoods, and inducing points.
no code implementations • 31 Oct 2017 • Alexander Lambert, Amirreza Shaban, Amit Raj, Zhen Liu, Byron Boots
We consider the problems of learning forward models that map state to high-dimensional images and inverse models that map high-dimensional images to state in robotics.
no code implementations • 24 Oct 2017 • Siddarth Srinivasan, Geoff Gordon, Byron Boots
We extend previous work on HQMMs with three contributions: (1) we show how classical hidden Markov models (HMMs) can be simulated on a quantum circuit, (2) we reformulate HQMMs by relaxing the constraints for modeling HMMs on quantum circuits, and (3) we present a learning algorithm to estimate the parameters of an HQMM from data.
no code implementations • 15 Oct 2017 • Xinyan Yan, Krzysztof Choromanski, Byron Boots, Vikas Sindhwani
Policy evaluation, i.e., value function or Q-function approximation, is a key procedure in reinforcement learning (RL).
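The simplest instance of this procedure is tabular TD(0) policy evaluation, sketched below with an assumed `env_reset`/`env_step` interface; this is background for the abstract, not the paper's algorithm.

```python
import numpy as np

def td0_evaluate(env_reset, env_step, policy, n_states,
                 episodes=500, alpha=0.1, gamma=0.99):
    """Tabular TD(0): V(s) tracks the expected return of a fixed policy
    via one-step bootstrapped targets.
    env_reset() -> s0; env_step(s, a) -> (s_next, reward, done)."""
    V = np.zeros(n_states)
    for _ in range(episodes):
        s, done = env_reset(), False
        while not done:
            a = policy(s)
            s_next, r, done = env_step(s, a)
            target = r + (0.0 if done else gamma * V[s_next])
            V[s] += alpha * (target - V[s])   # TD(0) update
            s = s_next
    return V
```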
no code implementations • NeurIPS 2017 • Arun Venkatraman, Nicholas Rhinehart, Wen Sun, Lerrel Pinto, Martial Hebert, Byron Boots, Kris M. Kitani, J. Andrew Bagnell
We seek to combine the advantages of RNNs and PSRs by augmenting existing state-of-the-art recurrent neural networks with Predictive-State Decoders (PSDs), which add supervision to the network's internal state representation to target predicting future observations.
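A hypothetical sketch of the auxiliary supervision described: decode the next k observations from the network's internal state and add the squared error, weighted by beta, to the primary task loss. The `decoder` interface and shapes are assumptions.

```python
import numpy as np

def psd_loss(hidden_states, observations, decoder, k=3, beta=0.5):
    """Predictive-State Decoder style auxiliary objective.

    hidden_states: (T, d) internal RNN states
    observations:  (T, p) observation sequence
    decoder:       maps a state to a flattened prediction of the next
                   k observations, shape (k*p,)
    """
    T = len(hidden_states)
    loss = 0.0
    for t in range(T - k):
        future = observations[t + 1:t + 1 + k].ravel()  # next k observations
        loss += np.sum((decoder(hidden_states[t]) - future) ** 2)
    return beta * loss / max(T - k, 1)
```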
no code implementations • 21 Sep 2017 • Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron Boots
We present an end-to-end imitation learning system for agile, off-road autonomous driving using only low-cost on-board sensors.
Robotics
8 code implementations • 11 Sep 2017 • Amirreza Shaban, Shray Bansal, Zhen Liu, Irfan Essa, Byron Boots
Low-shot learning methods for image classification support learning from sparse data.
no code implementations • ICML 2017 • Yunpeng Pan, Xinyan Yan, Evangelos A. Theodorou, Byron Boots
Sparse Spectrum Gaussian Processes (SSGPs) are a powerful tool for scaling Gaussian processes (GPs) to large datasets.
1 code implementation • 24 Jul 2017 • Mustafa Mukadam, Jing Dong, Xinyan Yan, Frank Dellaert, Byron Boots
We benchmark our algorithms against several sampling-based and trajectory optimization-based motion planning algorithms on planning problems in multiple environments.
Robotics
no code implementations • NeurIPS 2017 • Carlton Downey, Ahmed Hefny, Boyue Li, Byron Boots, Geoffrey Gordon
We present a new model, Predictive State Recurrent Neural Networks (PSRNNs), for filtering and prediction in dynamical systems.
2 code implementations • 17 May 2017 • Jing Dong, Byron Boots, Frank Dellaert
Continuous-time trajectory representations are a powerful tool that can be used to address several issues in many practical simultaneous localization and mapping (SLAM) scenarios, such as continuously collected measurements distorted by robot motion or asynchronous sensor measurements.
Robotics
no code implementations • ICML 2017 • Wen Sun, Arun Venkatraman, Geoffrey J. Gordon, Byron Boots, J. Andrew Bagnell
We demonstrate that AggreVaTeD, a policy gradient extension of the Imitation Learning (IL) approach of Ross & Bagnell (2014), can leverage such an oracle to achieve faster and better solutions with less training data than a less-informed Reinforcement Learning (RL) technique.
no code implementations • NeurIPS 2016 • Ching-An Cheng, Byron Boots
Recent work on scaling up Gaussian process regression (GPR) to large datasets has primarily focused on sparse GPR, which leverages a small set of basis functions to approximate the full Gaussian process during inference.
no code implementations • 8 Oct 2016 • Jing Dong, John Gary Burnham, Byron Boots, Glen C. Rains, Frank Dellaert
Autonomous crop monitoring at high spatial and temporal resolution is a critical problem in precision agriculture.
no code implementations • 22 Aug 2016 • Yunpeng Pan, Xinyan Yan, Evangelos Theodorou, Byron Boots
Robotic systems must be able to quickly and robustly make decisions when operating in uncertain and dynamic environments.
no code implementations • 15 Jul 2016 • Bo Dai, Niao He, Yunpeng Pan, Byron Boots, Le Song
In such problems, each sample $x$ itself is associated with a conditional distribution $p(z|x)$ represented by samples $\{z_i\}_{i=1}^M$, and the goal is to learn a function $f$ that links these conditional distributions to target values $y$.
no code implementations • 30 Dec 2015 • Wen Sun, Arun Venkatraman, Byron Boots, J. Andrew Bagnell
Latent state space models are a fundamental and widely used tool for modeling dynamical systems.
no code implementations • 26 Sep 2013 • Byron Boots, Geoffrey Gordon, Arthur Gretton
The essence is to represent the state as a nonparametric conditional embedding operator in a Reproducing Kernel Hilbert Space (RKHS) and leverage recent work in kernel methods to estimate, predict, and update the representation.
no code implementations • NeurIPS 2010 • Byron Boots, Geoffrey J. Gordon
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification.
no code implementations • 6 Oct 2009 • Sajid M. Siddiqi, Byron Boots, Geoffrey J. Gordon
We introduce the Reduced-Rank Hidden Markov Model (RR-HMM), a generalization of HMMs that can model smooth state evolution as in Linear Dynamical Systems (LDSs) as well as non-log-concave predictive distributions as in continuous-observation HMMs.