It is well known that the extension of Watkins' algorithm to general function approximation settings is challenging: does the projected Bellman equation have a solution? If so, is the solution useful in the sense of generating a good policy? ...
