no code implementations • 25 May 2019 • Marc Brittain, Josh Bertram, Xuxi Yang, Peng Wei
Experience replay is widely used in deep reinforcement learning algorithms and allows agents to remember and learn from past experiences.
no code implementations • 9 Jun 2018 • Josh Bertram, Peng Wei
We present a method for a certain class of Markov Decision Processes (MDPs) that can relate the optimal policy back to one or more reward sources in the environment.