Search Results for author: Joseph Lubars

Found 3 papers, 1 paper with code

The Role of Lookahead and Approximate Policy Evaluation in Reinforcement Learning with Linear Value Function Approximation

no code implementations • 28 Sep 2021 • Anna Winnicki, Joseph Lubars, Michael Livesay, R. Srikant

In practice, techniques such as lookahead for policy improvement and m-step rollout for policy evaluation are used to improve the performance of approximate dynamic programming with function approximation.
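The m-step rollout for policy evaluation mentioned above can be sketched on a toy two-state chain. The MDP, features, discount factor, and iteration counts below are illustrative assumptions, not taken from the paper; the sketch only shows the general idea of rolling the model forward m steps and then bootstrapping with a linear value estimate:

```python
import numpy as np

# Hypothetical 2-state Markov chain under a fixed policy; all numbers
# are made up for illustration.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])      # transition matrix under the policy
r = np.array([1.0, 0.0])        # expected per-state reward
gamma = 0.9                     # discount factor
phi = np.array([[1.0], [0.5]])  # linear features, one row per state

def m_step_rollout_target(w, m):
    """m-step rollout target: accumulate m discounted expected rewards,
    then bootstrap with the linear value estimate Phi @ w."""
    target = np.zeros(2)
    Pk = np.eye(2)               # P^k, starting at k = 0
    for k in range(m):
        target += (gamma ** k) * (Pk @ r)
        Pk = Pk @ P
    target += (gamma ** m) * (Pk @ (phi @ w).ravel())
    return target

# Approximate policy evaluation: repeatedly project the m-step target
# back onto the span of the features via least squares.
w = np.zeros(1)
for _ in range(50):
    y = m_step_rollout_target(w, m=5)
    w, *_ = np.linalg.lstsq(phi, y, rcond=None)
```

Because the m-step operator contracts with modulus gamma**m rather than gamma, larger m makes the projected iteration better behaved, which is one intuition behind using rollout with function approximation.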

Optimistic Policy Iteration for MDPs with Acyclic Transient State Structure

no code implementations • 29 Jan 2021 • Joseph Lubars, Anna Winnicki, Michael Livesay, R. Srikant

We consider Markov Decision Processes (MDPs) in which every stationary policy induces the same graph structure for the underlying Markov chain and, further, the graph has the following property: if we replace each recurrent class by a node, then the resulting graph is acyclic.
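The structural property above can be checked mechanically: every cycle of the transition graph must lie inside a recurrent class, so every strongly connected component that has an edge leaving it (i.e., every transient component) must be a single state with no self-loop. The example graph below is a made-up illustration, not one from the paper:

```python
# Hypothetical transition-support graph: edge u -> v iff some action
# moves u to v with positive probability.  Here {3, 4} is the only
# recurrent class; 0, 1, 2 are transient.
edges = {0: [1, 2], 1: [3], 2: [3], 3: [4], 4: [3]}

def sccs(graph):
    """Tarjan's algorithm: return the strongly connected components."""
    index, low, on_stack, stack, comps = {}, {}, set(), [], []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            comps.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return comps

def has_acyclic_transient_structure(graph):
    """Check the property from the abstract: after collapsing each
    recurrent class (closed SCC) to a single node, the graph is
    acyclic.  Equivalently, every cycle lies inside a recurrent class."""
    for comp in sccs(graph):
        # A recurrent class is an SCC with no edge leaving it.
        closed = all(w in comp for v in comp for w in graph.get(v, []))
        if closed:
            continue
        # A transient SCC must be a lone state without a self-loop.
        if len(comp) > 1 or any(v in graph.get(v, []) for v in comp):
            return False
    return True
```

For the example graph the check passes, while a graph with a cycle among transient states (say `{0: [1], 1: [0, 2], 2: [2]}`) fails it.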
