Search Results for author: Yuguang Yue

Found 7 papers, 3 papers with code

ARSM: Augment-REINFORCE-Swap-Merge Estimator for Gradient Backpropagation Through Categorical Variables

1 code implementation • 4 May 2019 • Mingzhang Yin, Yuguang Yue, Mingyuan Zhou

To address the challenge of backpropagating the gradient through categorical variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient estimator that is unbiased and has low variance.


Implicit Distributional Reinforcement Learning

3 code implementations • NeurIPS 2020 • Yuguang Yue, Zhendong Wang, Mingyuan Zhou

To improve the sample efficiency of policy-gradient-based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC), which consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution.

Distributional Reinforcement Learning • OpenAI Gym • +2 more


Discrete Action On-Policy Learning with Action-Value Critic

1 code implementation • 10 Feb 2020 • Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

Reinforcement learning (RL) in discrete action spaces is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy, gradient-based deep RL algorithms efficiently.

OpenAI Gym • Reinforcement Learning (RL)