Search Results for author: Hsin-En Su

Found 1 paper, 0 papers with code

Coordinate Ascent for Off-Policy RL with Global Convergence Guarantees

no code implementations • 10 Dec 2022 • Hsin-En Su, Yen-ju Chen, Ping-Chun Hsieh, Xi Liu

In this paper, we rethink off-policy learning via Coordinate Ascent Policy Optimization (CAPO), an off-policy actor-critic algorithm that decouples policy improvement from the state distribution of the behavior policy without using the policy gradient.

counterfactual

