Search Results for author: Yunchang Yang

Found 6 papers, 1 paper with code

A Reduction-based Framework for Sequential Decision Making with Delayed Feedback

no code implementations • NeurIPS 2023 • Yunchang Yang, Han Zhong, Tianhao Wu, Bin Liu, LiWei Wang, Simon S. Du

We study stochastic delayed feedback in general multi-agent sequential decision making, which includes bandits, single-agent Markov decision processes (MDPs), and Markov games (MGs).

Decision Making

Nearly Optimal Policy Optimization with Stable at Any Time Guarantee

no code implementations • 21 Dec 2021 • Tianhao Wu, Yunchang Yang, Han Zhong, LiWei Wang, Simon S. Du, Jiantao Jiao

Policy optimization methods are one of the most widely used classes of Reinforcement Learning (RL) algorithms.

4k Reinforcement Learning (RL)

A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning

no code implementations • ICLR 2022 • Yunchang Yang, Tianhao Wu, Han Zhong, Evrard Garcelon, Matteo Pirotta, Alessandro Lazaric, LiWei Wang, Simon S. Du

We also obtain a new upper bound for conservative low-rank MDP.

Multi-Armed Bandits reinforcement-learning +1

(Locally) Differentially Private Combinatorial Semi-Bandits

no code implementations • ICML 2020 • Xiaoyu Chen, Kai Zheng, Zixin Zhou, Yunchang Yang, Wei Chen, Li-Wei Wang

In this paper, we study Combinatorial Semi-Bandits (CSB), an extension of classic Multi-Armed Bandits (MAB), under the Differential Privacy (DP) setting and the stronger Local Differential Privacy (LDP) setting.

Multi-Armed Bandits Privacy Preserving

On Layer Normalization in the Transformer Architecture

8 code implementations • ICML 2020 • Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, Li-Wei Wang, Tie-Yan Liu

This motivates us to remove the warm-up stage for the training of Pre-LN Transformers.

7,560

On the Anomalous Generalization of GANs

no code implementations • 27 Sep 2019 • Jinchen Xuan, Yunchang Yang, Ze Yang, Di He, Li-Wei Wang

Motivated by this observation, we discover two specific problems of GANs leading to anomalous generalization behaviour, which we refer to as the sample insufficiency and the pixel-wise combination.
