no code implementations • 12 Nov 2022 • Gianluigi Grandesso, Elisa Alboni, Gastone P. Rosati Papini, Patrick M. Wensing, Andrea Del Prete
Thus, our algorithm learns a "good" control policy via TO-guided RL policy search; when this policy is used as an initial-guess provider for TO, the trajectory optimization process becomes less prone to converging to poor local optima.
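
The role of the initial guess can be illustrated with a toy problem. The sketch below is not the authors' algorithm: it assumes a hypothetical one-dimensional non-convex cost and uses plain gradient descent as a stand-in for a local trajectory optimizer. The names `cost`, `grad`, `local_optimize`, `naive_guess`, and `policy_guess` are all illustrative assumptions, and `policy_guess` is simply a hand-picked value standing in for what a learned policy might provide.

```python
# Toy illustration (assumption): a non-convex cost with one poor local
# minimum (near x = 1.13) and one better minimum (near x = -1.30).
def cost(x):
    return x**4 - 3 * x**2 + x


def grad(x):
    # Analytic derivative of the toy cost above.
    return 4 * x**3 - 6 * x + 1


def local_optimize(x0, step=0.01, iters=2000):
    # Plain gradient descent, used here as a stand-in for a local
    # trajectory optimizer: it only finds the minimum of the basin
    # that the initial guess x0 starts in.
    x = x0
    for _ in range(iters):
        x -= step * grad(x)
    return x


naive_guess = 2.0    # hypothetical uninformed initial guess
policy_guess = -1.0  # hypothetical guess standing in for a learned policy

x_naive = local_optimize(naive_guess)
x_warm = local_optimize(policy_guess)

print(f"naive start:  x = {x_naive:.3f}, cost = {cost(x_naive):.3f}")
print(f"warm start:   x = {x_warm:.3f}, cost = {cost(x_warm):.3f}")
```

In this toy setting the same local optimizer reaches a lower-cost solution when started from the better guess, which is the effect the abstract attributes to using the learned policy as a warm start.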