Search Results for author: Yaozhong Gan

Found 4 papers, 1 paper with code

Robust Action Gap Increasing with Clipped Advantage Learning

no code implementations · 20 Mar 2022 · Zhe Zhang, Yaozhong Gan, Xiaoyang Tan

Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors in order to improve robustness to estimation errors.

Smoothing Advantage Learning

no code implementations · 20 Mar 2022 · Yaozhong Gan, Zhe Zhang, Xiaoyang Tan

Advantage learning (AL) aims to improve the robustness of value-based reinforcement learning against estimation errors with action-gap-based regularization.

Stabilizing Q Learning Via Soft Mellowmax Operator

no code implementations · 17 Dec 2020 · Yaozhong Gan, Zhe Zhang, Xiaoyang Tan

Learning complicated value functions in high-dimensional state spaces by function approximation is a challenging task, partly because the max operator used in temporal-difference updates can theoretically cause instability for most linear or non-linear approximation schemes.

Multi-agent Reinforcement Learning · Q-Learning

Trust Region-Guided Proximal Policy Optimization

2 code implementations · NeurIPS 2019 · Yuhui Wang, Hao He, Xiaoyang Tan, Yaozhong Gan

We formally show that this method not only improves the exploration ability within the trust region but also enjoys a better performance bound than the original PPO.

Reinforcement Learning (RL)
