# On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning

10 Feb 2020 · Che Wang, Keith Ross

A simple and natural algorithm for reinforcement learning is Monte Carlo Exploring Starts (MCES), where the Q-function is estimated by averaging the Monte Carlo returns, and the policy is improved by choosing actions that maximize the current estimate of the Q-function. Exploration is performed by "exploring starts": each episode begins with a randomly chosen state and action and then follows the current policy...
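The loop described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the three-state deterministic MDP (`NEXT_STATE`, `REWARD`), undiscounted returns, uniform sampling of the starting pair, incremental averaging, and the function name `mces`. The toy MDP only moves forward, so every episode terminates.

```python
import random

# Hypothetical 3-state episodic MDP, invented for this sketch (not from the paper).
# NEXT_STATE[s][a] is the successor of state s under action a; TERMINAL ends the episode.
TERMINAL = 3
NEXT_STATE = {0: [TERMINAL, 1], 1: [2, TERMINAL], 2: [TERMINAL, TERMINAL]}
REWARD = {0: [0.8, 0.0], 1: [0.0, 0.5], 2: [0.0, 1.0]}
N_ACTIONS = 2


def mces(num_episodes=5000, seed=0):
    """Monte Carlo Exploring Starts on the toy MDP above (undiscounted returns)."""
    rng = random.Random(seed)
    q = {s: [0.0] * N_ACTIONS for s in NEXT_STATE}     # Q-function estimates
    counts = {s: [0] * N_ACTIONS for s in NEXT_STATE}  # number of returns averaged
    policy = {s: 0 for s in NEXT_STATE}                # greedy w.r.t. the initial Q

    for _ in range(num_episodes):
        # Exploring start: sample the first state and action uniformly at random.
        state = rng.choice(list(NEXT_STATE))
        action = rng.randrange(N_ACTIONS)

        # Roll out the episode, following the current policy after the first step.
        trajectory = []
        while state != TERMINAL:
            trajectory.append((state, action, REWARD[state][action]))
            state = NEXT_STATE[state][action]
            if state != TERMINAL:
                action = policy[state]

        # Walk backwards, accumulating the return and averaging it into Q.
        g = 0.0
        for s, a, r in reversed(trajectory):
            g += r
            counts[s][a] += 1
            q[s][a] += (g - q[s][a]) / counts[s][a]
            # Policy improvement: act greedily with respect to the current Q estimate.
            policy[s] = max(range(N_ACTIONS), key=lambda b: q[s][b])

    return q, policy


if __name__ == "__main__":
    q, policy = mces()
    print(policy)
```

In this toy problem no state is revisited within an episode, so first-visit and every-visit averaging coincide. Early returns collected under a poor policy stay in the averages, which is why the sketch runs several thousand episodes before reading off the greedy policy.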

