11 Feb 2019 • Nikki Lijing Kuang, Clement H. C. Leung, Vienne W. K. Sung
In reinforcement learning episodes, the rewards and punishments are often non-deterministic, and there are invariably stochastic elements governing the underlying situation.