Search Results for author: Luobao Zou

Found 1 paper, 0 papers with code

Soft policy optimization using dual-track advantage estimator

No code implementations • 15 Sep 2020 • Yubo Huang, Xuechun Wang, Luobao Zou, Zhiwei Zhuang, Weidong Zhang

In reinforcement learning (RL), we expect the agent to explore as many states as possible in the initial stage of training and then exploit the explored information in the subsequent stage to discover the trajectory with the highest return.

Reinforcement Learning (RL)
