11 Dec 2021 • Navid Rezazadeh, Maxwell Kolarich, Solmaz S. Kia, Negar Mehr
We then learn both the control policy and the contraction metric such that the distance between the trajectories from the offline data set and our generated auxiliary sample trajectories decreases over time.
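
The training objective described above can be illustrated with a minimal sketch. It assumes discrete-time trajectories and a constant positive-definite metric matrix `M` (a learned metric would generally depend on the state). The function names, the hinge penalty, and the `rate` parameter are hypothetical choices made for illustration and are not taken from the paper.

```python
import numpy as np


def metric_distance(x, y, M):
    """Squared distance (x - y)^T M (x - y) under the metric matrix M."""
    d = x - y
    return float(d @ M @ d)


def contraction_loss(data_traj, aux_traj, M, rate=0.1):
    """Average hinge penalty over time steps where the metric distance
    between the two trajectories fails to shrink by a factor (1 - rate).

    data_traj, aux_traj: arrays of shape (T, n), with T >= 2.
    M: (n, n) positive-definite matrix (assumed constant in this sketch).
    """
    steps = len(data_traj) - 1
    loss = 0.0
    for t in range(steps):
        d_now = metric_distance(data_traj[t], aux_traj[t], M)
        d_next = metric_distance(data_traj[t + 1], aux_traj[t + 1], M)
        # Zero penalty when d_next <= (1 - rate) * d_now, positive otherwise.
        loss += max(0.0, d_next - (1.0 - rate) * d_now)
    return loss / steps


# Toy example: the data trajectory stays at the origin.
data = np.zeros((5, 2))
converging = np.array([[0.5 ** t, 0.5 ** t] for t in range(5)])
diverging = np.array([[2.0 ** t, 2.0 ** t] for t in range(5)])
M = np.eye(2)

loss_converging = contraction_loss(data, converging, M)
loss_diverging = contraction_loss(data, diverging, M)
```

In a learning setup, a penalty of this kind would be minimized jointly over the parameters of the policy (which generates the auxiliary trajectory) and of the metric. Here both are fixed, so the sketch only shows how the penalty separates a converging pair of trajectories from a diverging one.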