Search Results for author: Yaozhong Gan

Found 4 papers, 1 paper with code

Smoothing Advantage Learning

no code implementations • 20 Mar 2022 • Yaozhong Gan, Zhe Zhang, Xiaoyang Tan

Advantage learning (AL) aims to improve the robustness of value-based reinforcement learning against estimation errors with action-gap-based regularization.

Robust Action Gap Increasing with Clipped Advantage Learning

no code implementations • 20 Mar 2022 • Zhe Zhang, Yaozhong Gan, Xiaoyang Tan

Advantage Learning (AL) seeks to increase the action gap between the optimal action and its competitors, so as to improve the robustness to estimation errors.

Stabilizing Q Learning Via Soft Mellowmax Operator

no code implementations • 17 Dec 2020 • Yaozhong Gan, Zhe Zhang, Xiaoyang Tan

Learning complicated value functions in high-dimensional state spaces by function approximation is a challenging task, partly because the max operator used in temporal-difference updates can theoretically cause instability for most linear or non-linear approximation schemes.

Multi-agent Reinforcement Learning, Q-Learning

Trust Region-Guided Proximal Policy Optimization

2 code implementations • NeurIPS 2019 • Yuhui Wang, Hao He, Xiaoyang Tan, Yaozhong Gan

We formally show that this method not only improves the exploration ability within the trust region but enjoys a better performance bound compared to the original PPO as well.

Reinforcement Learning (RL)
