Search Results for author: Wenhao Yang

Found 13 papers, 2 papers with code

Avoiding Model Estimation in Robust Markov Decision Processes with a Generative Model

no code implementations • 2 Feb 2023 Wenhao Yang, Han Wang, Tadashi Kozuno, Scott M. Jordan, Zhihua Zhang

To remove the need for an oracle, we first transform the original robust MDPs into an alternative form, which allows us to use stochastic gradient methods to solve the robust MDPs.

Statistical Estimation of Confounded Linear MDPs: An Instrumental Variable Approach

no code implementations • 12 Sep 2022 Miao Lu, Wenhao Yang, Liangyu Zhang, Zhihua Zhang

Specifically, we propose a two-stage estimator based on the instrumental variables and establish its statistical properties in the confounded MDPs with a linear structure.

Off-policy evaluation
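The two-stage construction can be illustrated with a generic two-stage least squares (2SLS) sketch. This is textbook 2SLS, not the paper's estimator for confounded linear MDPs, and the function name and setup below are illustrative:

```python
import numpy as np

def two_stage_least_squares(Z, X, Y):
    """Generic 2SLS: de-confound X by projecting it onto instruments Z,
    then regress Y on the projected regressors.

    Z: (n, d_z) instruments, X: (n, d_x) confounded regressors, Y: (n,) outcomes.
    """
    # Stage 1: fit X ~ Z and form the predicted (exogenous) part of X.
    W, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ W
    # Stage 2: regress Y on the stage-1 predictions.
    beta, *_ = np.linalg.lstsq(X_hat, Y, rcond=None)
    return beta
```

With a hidden confounder influencing both X and Y, ordinary least squares is biased, while the 2SLS estimate remains consistent as long as the instruments are correlated with X but independent of the confounder.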

Pluralistic Image Completion with Probabilistic Mixture-of-Experts

no code implementations • 18 May 2022 Xiaobo Xia, Wenhao Yang, Jie Ren, Yewen Li, Yibing Zhan, Bo Han, Tongliang Liu

Second, the constraints for diversity are designed to be task-agnostic, which prevents them from working well.

Federated Reinforcement Learning with Environment Heterogeneity

1 code implementation • 6 Apr 2022 Hao Jin, Yang Peng, Wenhao Yang, Shusen Wang, Zhihua Zhang

We study a Federated Reinforcement Learning (FedRL) problem in which $n$ agents collaboratively learn a single policy without sharing the trajectories they collected during agent-environment interaction.

reinforcement-learning Reinforcement Learning (RL)
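A minimal tabular sketch of the federated-averaging idea (not the paper's exact algorithms): each agent runs a few Q-learning sweeps in its own environment, and a server averages the Q-tables, so trajectories never leave the agents. The environments, learning rate, and schedule below are illustrative:

```python
import numpy as np

def local_q_step(Q, P, R, gamma=0.9, lr=0.5):
    """One synchronous Q-learning sweep in an agent's own environment.
    P: (S, A, S) transition kernel, R: (S, A) rewards."""
    target = R + gamma * P @ Q.max(axis=1)          # expected Bellman target, (S, A)
    return Q + lr * (target - Q)

def fed_q_learning(envs, S, A, rounds=200, local_steps=5, gamma=0.9):
    """Federated loop sketch: local Q-learning, then server-side averaging."""
    Q = np.zeros((S, A))
    for _ in range(rounds):
        updates = []
        for P, R in envs:                           # each agent's own environment
            q = Q.copy()
            for _ in range(local_steps):
                q = local_q_step(q, P, R, gamma)
            updates.append(q)
        Q = np.mean(updates, axis=0)                # aggregate without sharing trajectories
    return Q
```

When the environments are heterogeneous, the averaged Q-table targets a compromise across agents rather than any single environment's optimum, which is exactly the tension the paper's analysis quantifies.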

AnomMAN: Detect Anomaly on Multi-view Attributed Networks

no code implementations • 8 Jan 2022 Ling-Hao Chen, He Li, Wenhao Yang

In fact, it remains a challenging task to handle all the different kinds of interaction actions uniformly and to detect anomalous instances in multi-view attributed networks.

Anomaly Detection

A Statistical Analysis of Polyak-Ruppert Averaged Q-learning

no code implementations • 29 Dec 2021 Xiang Li, Wenhao Yang, Jiadong Liang, Zhihua Zhang, Michael I. Jordan

We study Q-learning with Polyak-Ruppert averaging in a discounted Markov decision process in synchronous and tabular settings.
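The setting can be sketched as synchronous tabular Q-learning with a Polyak-Ruppert running average of the iterates. For brevity, the sketch below replaces the sampled targets with their exact expectations, so it only illustrates the averaging mechanics, not the stochastic analysis; the step-size exponent is illustrative:

```python
import numpy as np

def averaged_q_learning(P, R, gamma=0.9, T=20000):
    """Synchronous Q-learning with polynomially decaying steps and a
    Polyak-Ruppert running average of the iterates.
    P: (S, A, S) transition kernel, R: (S, A) rewards."""
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    Q_bar = np.zeros((S, A))
    for t in range(T):
        lr = 1.0 / (t + 1) ** 0.7                   # polynomial step size
        target = R + gamma * P @ Q.max(axis=1)      # expected Bellman target
        Q = Q + lr * (target - Q)
        Q_bar += (Q - Q_bar) / (t + 1)              # Polyak-Ruppert average
    return Q, Q_bar
```

With noisy sampled targets, the averaged iterate `Q_bar` is what enjoys the improved statistical behavior; in this noise-free sketch it simply trails the last iterate.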


Towards Theoretical Understandings of Robust Markov Decision Processes: Sample Complexity and Asymptotics

no code implementations • 9 May 2021 Wenhao Yang, Liangyu Zhang, Zhihua Zhang

In this paper, we study the non-asymptotic and asymptotic performance of the optimal robust policy and value function of robust Markov Decision Processes (MDPs), where both are computed solely from a generative model.

Communication-Efficient Local Decentralized SGD Methods

no code implementations • 21 Oct 2019 Xiang Li, Wenhao Yang, Shusen Wang, Zhihua Zhang

Recently, local updates have become a powerful tool in centralized settings for improving communication efficiency via periodic communication.

Distributed Computing
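The local-update scheme can be sketched on a toy problem where worker k minimizes its own quadratic f_k(x) = ½‖x − c_k‖²: workers take several gradient steps independently and communicate only by periodically averaging their iterates. The objective, step counts, and learning rate are illustrative, and the exact gradient here stands in for a stochastic one:

```python
import numpy as np

def local_sgd(centers, rounds=100, local_steps=10, lr=0.1):
    """Local SGD sketch: worker k runs `local_steps` gradient steps on
    f_k(x) = 0.5 * ||x - c_k||^2, then the workers average their iterates."""
    x = np.zeros_like(centers[0])
    for _ in range(rounds):
        local_iterates = []
        for c in centers:
            z = x.copy()
            for _ in range(local_steps):
                z -= lr * (z - c)           # local gradient step
            local_iterates.append(z)
        x = np.mean(local_iterates, axis=0)  # periodic communication
    return x
```

The global minimizer of the averaged objective is the mean of the centers, which the periodic averaging recovers while communicating `local_steps` times less often than fully synchronous SGD.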

On the Convergence of FedAvg on Non-IID Data

2 code implementations • ICLR 2020 Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, Zhihua Zhang

In this paper, we analyze the convergence of FedAvg on non-iid data and establish a convergence rate of $\mathcal{O}(\frac{1}{T})$ for strongly convex and smooth problems, where $T$ is the number of SGD steps.

Edge-computing Federated Learning
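A minimal FedAvg sketch on non-iid least squares: each device holds its own data with a different feature distribution, runs a few local gradient steps, and the server averages the models. The data-generating setup and hyperparameters are hypothetical, and the constant step size is a simplification of the decaying schedule the analysis requires:

```python
import numpy as np

def fedavg(datasets, dim, rounds=300, local_steps=5, lr=0.02):
    """FedAvg sketch: device k holds (X_k, y_k) and runs `local_steps`
    gradient steps on its local least-squares loss before the server
    averages the resulting models."""
    w = np.zeros(dim)
    for _ in range(rounds):
        updates = []
        for X, y in datasets:
            v = w.copy()
            for _ in range(local_steps):
                v -= lr * X.T @ (X @ v - y) / len(y)   # local gradient step
            updates.append(v)
        w = np.mean(updates, axis=0)                   # server averaging
    return w
```

With heterogeneous (non-iid) local data, the local steps pull each device toward its own optimum, and the averaged model need not coincide with the global minimizer in general; quantifying that gap is the heart of the convergence analysis.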

A Regularized Approach to Sparse Optimal Policy in Reinforcement Learning

no code implementations • NeurIPS 2019 Xiang Li, Wenhao Yang, Zhihua Zhang

We propose and study a general framework for regularized Markov decision processes (MDPs) where the goal is to find an optimal policy that maximizes the expected discounted total reward plus a policy regularization term.

reinforcement-learning Reinforcement Learning (RL)
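One common instance of such a framework is entropy regularization, where the Bellman max becomes a temperature-scaled log-sum-exp and the optimal policy is a softmax. The sketch below shows only this special case, not the paper's general framework; the temperature and toy MDP are illustrative:

```python
import numpy as np

def soft_value_iteration(P, R, gamma=0.9, tau=1.0, iters=500):
    """Entropy-regularized value iteration: the hard max over actions is
    replaced by tau * logsumexp(Q / tau), and the greedy policy by a softmax.
    P: (S, A, S) transition kernel, R: (S, A) rewards.
    (For large |Q|/tau, use a shift-stabilized logsumexp in practice.)"""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * P @ V                            # (S, A)
        V = tau * np.log(np.exp(Q / tau).sum(axis=1))    # soft maximum
    pi = np.exp((Q - V[:, None]) / tau)                  # softmax policy
    return V, pi
```

As the temperature tau tends to zero, the soft maximum approaches the hard max and the softmax policy approaches the greedy one, while larger tau trades reward for more stochastic (higher-entropy) policies.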

Accelerated Value Iteration via Anderson Mixing

no code implementations • 27 Sep 2018 YuJun Li, Chengzhuo Ni, Guangzeng Xie, Wenhao Yang, Shuchang Zhou, Zhihua Zhang

A2VI is more efficient than modified policy iteration, a classical approximate method for policy evaluation.

Atari Games Q-Learning +2
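The general idea can be sketched as Anderson mixing applied to the Bellman operator: keep the last m iterates and extrapolate with least-squares weights that minimize the combined residual, subject to the weights summing to one. This is generic Anderson acceleration, not necessarily the paper's A2VI; the toy MDP, depth m, and ridge term are illustrative:

```python
import numpy as np

def bellman(V, P, R, gamma):
    """Optimal Bellman operator for a tabular MDP."""
    return (R + gamma * P @ V).max(axis=1)

def anderson_vi(P, R, gamma=0.95, m=3, iters=60):
    """Anderson-accelerated value iteration sketch: combine the last m
    Bellman iterates with weights minimizing the mixed residual."""
    S = P.shape[0]
    V = np.zeros(S)
    Vs, Fs = [], []                          # past operator outputs and residuals
    for _ in range(iters):
        G = bellman(V, P, R, gamma)
        Vs.append(G)
        Fs.append(G - V)                     # residual g(V) - V
        Vs, Fs = Vs[-m:], Fs[-m:]
        F = np.stack(Fs, axis=1)             # (S, k)
        k = F.shape[1]
        # Minimize ||F a|| subject to sum(a) = 1: normal equations with a
        # small ridge term for numerical stability.
        A = F.T @ F + 1e-10 * np.eye(k)
        a = np.linalg.solve(A, np.ones(k))
        a /= a.sum()
        V = np.stack(Vs, axis=1) @ a         # mixed iterate
    return V
```

On regions where the greedy policy is stable, the Bellman operator is affine and the mixing behaves like a Krylov method, which is why it can converge much faster than the plain gamma-contraction of value iteration.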
