We present a general two-stage reinforcement learning approach that uses a single demonstration generated by trajectory optimization to create robust policies that can be deployed on real robots without any additional training.
In this work, we introduce a generic and scalable method based on multiple shooting to learn latent representations of indirectly observed dynamical systems.
We need intelligent robots for mobile construction, the process of navigating in an environment and modifying its structure according to a geometric design.
We need intelligent robots to perform mobile construction, the process of moving in an environment and modifying its geometry according to a design plan.
We consider failing behaviors as those that violate a constraint and address the problem of learning with crash constraints, where no data is obtained upon constraint violation.
In this approach, instead of adding a (conservative) terminal constraint to the problem, we propose using the measured state projected onto the viability kernel in the optimal control problem (OCP) solved at each control cycle.
2 code implementations • 8 Aug 2020 • Manuel Wüthrich, Felix Widmaier, Felix Grimminger, Joel Akpo, Shruti Joshi, Vaibhav Agrawal, Bilal Hammoud, Majid Khadiv, Miroslav Bogdanovic, Vincent Berenz, Julian Viereck, Maximilien Naveau, Ludovic Righetti, Bernhard Schölkopf, Stefan Bauer
Dexterous object manipulation remains an open problem in robotics, despite the rapid progress in machine learning during the past decade.
1 code implementation • 30 Sep 2019 • Felix Grimminger, Avadesh Meduri, Majid Khadiv, Julian Viereck, Manuel Wüthrich, Maximilien Naveau, Vincent Berenz, Steve Heim, Felix Widmaier, Thomas Flayols, Jonathan Fiene, Alexander Badri-Spröwitz, Ludovic Righetti
Finally, to demonstrate the capabilities of the quadruped, we present a novel controller which combines feedforward contact forces computed from a kino-dynamic optimizer with impedance control of the center of mass and base orientation.
We present a meta-learning method for learning parametric loss functions that can generalize across different tasks and model architectures.
2 code implementations • 11 Sep 2019 • Carlos Mastalli, Rohan Budhiraja, Wolfgang Merkt, Guilhem Saurel, Bilal Hammoud, Maximilien Naveau, Justin Carpentier, Ludovic Righetti, Sethu Vijayakumar, Nicolas Mansard
Additionally, we propose a novel optimal control algorithm called Feasibility-driven Differential Dynamic Programming (FDDP).
Robotics Optimization and Control
We propose learning a policy that outputs impedance and desired position in joint space, and we compare the performance of that approach to torque and position control under different contact uncertainties.
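As a rough illustration of what a policy output of "impedance and desired position in joint space" could feed into, the sketch below applies a common joint-space impedance (PD-like) law, tau_i = k_i (q_des_i - q_i) - d_i dq_i. The abstract does not state the exact control law, so the function name `impedance_torque`, the per-joint diagonal gains, and the damping term are assumptions made for illustration only.

```python
from typing import List, Sequence


def impedance_torque(
    q: Sequence[float],
    dq: Sequence[float],
    q_des: Sequence[float],
    stiffness: Sequence[float],
    damping: Sequence[float],
) -> List[float]:
    """Hypothetical joint-space impedance law (assumed, not taken from the paper).

    For each joint i: tau_i = k_i * (q_des_i - q_i) - d_i * dq_i
    where q_des and the gains would be the quantities chosen by the policy.
    """
    if not (len(q) == len(dq) == len(q_des) == len(stiffness) == len(damping)):
        raise ValueError("all inputs must have one entry per joint")
    return [
        k * (qd - qi) - d * dqi
        for qi, dqi, qd, k, d in zip(q, dq, q_des, stiffness, damping)
    ]


# Example: one joint at rest, 1 rad away from its target, stiffness 10, damping 1.
torques = impedance_torque([0.0], [0.0], [1.0], [10.0], [1.0])
```

In this reading, low stiffness makes the joint compliant under unexpected contact, while high stiffness approaches pure position control, which is one way to see why such an action space sits between torque control and position control.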
This information shapes the learned loss function such that the environment does not need to provide this information during meta-test time.
In this work, we propose a model-based reinforcement learning (MBRL) framework that combines Bayesian modeling of the system dynamics with curious iLQR, an iterative LQR approach that considers model uncertainty.
While it is possible to learn grasping policies without contact sensing, our results suggest that contact feedback allows for a significant improvement of grasping robustness under object pose uncertainty and for objects with a complex shape.
This model is non-linear and non-convex; however, we find a relaxation of the problem that allows us to formulate it as a single convex quadratically-constrained quadratic program (QCQP) that can be very efficiently optimized.
In this paper, we derive a probabilistic registration algorithm for object modeling and tracking.