1 code implementation • 4 May 2019 • Mingzhang Yin, Yuguang Yue, Mingyuan Zhou
To address the challenge of backpropagating the gradient through categorical variables, we propose the augment-REINFORCE-swap-merge (ARSM) gradient estimator, which is unbiased and has low variance.
3 code implementations • NeurIPS 2020 • Yuguang Yue, Zhendong Wang, Mingyuan Zhou
To improve the sample efficiency of policy-gradient-based reinforcement learning algorithms, we propose the implicit distributional actor-critic (IDAC), which consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution.
1 code implementation • 10 Feb 2020 • Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou
Reinforcement learning (RL) in discrete action space is ubiquitous in real-world applications, but its complexity grows exponentially with the action-space dimension, making it challenging to apply existing on-policy gradient-based deep RL algorithms efficiently.
no code implementations • 17 Oct 2019 • Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang
In this paper, we provide a framework with provable guarantees for selecting hyperparameters in a number of distinct models.
no code implementations • 18 Oct 2019 • Wenyuan Li, Zichen Wang, Yuguang Yue, Jiayun Li, William Speier, Mingyuan Zhou, Corey W. Arnold
In this work, we investigate semi-supervised learning (SSL) for image classification using adversarial training.
no code implementations • ICML 2020 • Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang
Tuning hyperparameters for unsupervised learning problems is difficult in general due to the lack of ground truth for validation.
no code implementations • 19 Jan 2022 • Yuguang Yue, Yuanpu Xie, Huasen Wu, Haofeng Jia, Shaodan Zhai, Wenzhe Shi, Jonathan J Hunt
Listwise ranking losses have been widely studied in recommender systems.