In this article, we sketch an algorithm that extends Q-learning to continuous action spaces. Our method is based on discretizing the action space...
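To make the idea of discretizing a continuous action space concrete, here is a minimal sketch. It is not the authors' algorithm, whose details are not given in the excerpt above. It assumes a one-dimensional action interval, a uniform grid, and a small tabular state space; the names `make_action_grid`, `q_update`, and `epsilon_greedy` are hypothetical helpers introduced only for illustration.

```python
import random


def make_action_grid(low, high, n_bins):
    """Return n_bins evenly spaced actions covering [low, high].

    Assumption: a uniform grid; the paper may discretize differently.
    """
    if n_bins < 2:
        raise ValueError("n_bins must be at least 2")
    step = (high - low) / (n_bins - 1)
    return [low + i * step for i in range(n_bins)]


def q_update(q, state, action_idx, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one standard Q-learning (off-policy TD) update in place.

    q is a table indexed as q[state][action_idx], where action_idx
    refers to a position in the discretized action grid.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action_idx] += alpha * (target - q[state][action_idx])


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a grid index: random with probability epsilon, else greedy."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return max(range(len(row)), key=row.__getitem__)


# Usage: 5 discrete actions on [-1, 1] and a 2-state table.
actions = make_action_grid(-1.0, 1.0, 5)
q = [[0.0] * len(actions) for _ in range(2)]
q_update(q, state=0, action_idx=2, reward=1.0, next_state=1, alpha=0.5, gamma=0.9)
best = epsilon_greedy(q, state=0, epsilon=0.0, rng=random.Random(0))
continuous_action = actions[best]  # map the chosen index back to a real-valued action
```

The agent learns over grid indices and maps the chosen index back to a real-valued action only when acting. A finer grid reduces discretization error but enlarges the table, which is the usual trade-off with this kind of approach.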
| METHOD | TYPE |
|---|---|
| | Off-Policy TD Control |