This paper investigates the use of Reinforcement Learning for the robust design of low-thrust interplanetary trajectories in presence of severe disturbances, modeled alternatively as Gaussian additive process noise, observation noise, control actuation errors on thrust magnitude and direction, and possibly multiple missed thrust events. The optimal control problem is recast as a time-discrete Markov Decision Process to comply with the standard formulation of reinforcement learning... (read more)
PDFMETHOD | TYPE | |
---|---|---|
![]() |
Policy Gradient Methods |