Search Results for author: Liangyu Zhang

Found 7 papers, 2 papers with code

Near Minimax-Optimal Distributional Temporal Difference Algorithms and the Freedman Inequality in Hilbert Spaces

no code implementations • 9 Mar 2024 • Yang Peng, Liangyu Zhang, Zhihua Zhang

In the tabular case, Rowland et al. (2018) and Rowland et al. (2023) proved the asymptotic convergence of two instances of distributional TD, namely the categorical temporal difference (CTD) and quantile temporal difference (QTD) algorithms, respectively.

Distributional Reinforcement Learning
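To make the abstract's setting concrete, here is a minimal sketch of a tabular quantile temporal difference (QTD) update. All names and the toy setup are illustrative, not taken from the paper: each state keeps `m` quantile estimates, and a transition nudges each estimate toward sampled Bellman targets using the quantile-regression subgradient.

```python
import numpy as np

def qtd_update(theta, s, r, s_next, gamma=0.9, lr=0.05):
    """One tabular QTD step (illustrative sketch).

    theta: array of shape (num_states, m) holding m quantile
           estimates of the return distribution per state.
    """
    m = theta.shape[1]
    taus = (np.arange(m) + 0.5) / m          # quantile midpoints tau_i
    targets = r + gamma * theta[s_next]       # sampled Bellman targets
    for i in range(m):
        # quantile-regression subgradient, averaged over the m targets
        grad = np.mean(taus[i] - (targets < theta[s, i]))
        theta[s, i] += lr * grad
    return theta
```

With a deterministic reward and `gamma=0`, all quantile estimates should settle near that reward, which gives a quick sanity check of the update direction.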

Semi-Infinitely Constrained Markov Decision Processes and Efficient Reinforcement Learning

1 code implementation • 29 Apr 2023 • Liangyu Zhang, Yang Peng, Wenhao Yang, Zhihua Zhang

To the best of our knowledge, we are the first to apply tools from semi-infinite programming (SIP) to solve constrained reinforcement learning problems.

Decision Making • Model-based Reinforcement Learning • +1
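A semi-infinite program imposes a constraint for every point of an infinite index set. The toy sketch below (unrelated to the paper's actual algorithm) illustrates the standard discretization idea: replace the infinite constraint set with a fine grid and keep only the tightest bound.

```python
import numpy as np

def solve_toy_sip(grid_size=101):
    """Maximize x subject to x <= 1 + y*(1 - y) for ALL y in [0, 1].

    Discretization sketch: the constraint must hold at every grid
    point, so the feasible region is cut by the minimum bound.
    The tightest bound is attained at y = 0 (or y = 1), giving x* = 1.
    """
    ys = np.linspace(0.0, 1.0, grid_size)
    bounds = 1.0 + ys * (1.0 - ys)   # one constraint per grid point
    return bounds.min()              # maximal feasible x
```

In practice, SIP methods refine the grid (or exchange active constraints) adaptively rather than fixing it up front; this sketch only shows why infinitely many constraints collapse to a worst-case bound.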

Statistical Estimation of Confounded Linear MDPs: An Instrumental Variable Approach

no code implementations • 12 Sep 2022 • Miao Lu, Wenhao Yang, Liangyu Zhang, Zhihua Zhang

Specifically, we propose a two-stage estimator based on the instrumental variables and establish its statistical properties in the confounded MDPs with a linear structure.

Off-policy evaluation
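The two-stage instrumental-variable idea can be illustrated with classical two-stage least squares (2SLS) on synthetic data. This is a generic sketch of the IV principle, not the paper's estimator for confounded linear MDPs: stage one isolates the confounder-free variation in the treatment, stage two regresses the outcome on that fitted variation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)                        # instrument
u = rng.normal(size=n)                        # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)          # treatment, confounded by u
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # outcome; true effect is 2.0

# Stage 1: regress x on z, keep fitted values (variation driven by z only)
Z = np.column_stack([np.ones(n), z])
x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]

# Stage 2: regress y on the fitted x_hat; the slope is consistent
# for the causal effect even though x itself is confounded
X = np.column_stack([np.ones(n), x_hat])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
```

Naively regressing `y` on `x` would absorb the confounder `u` into the slope and overestimate the effect; the instrument breaks that dependence.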

Towards Theoretical Understandings of Robust Markov Decision Processes: Sample Complexity and Asymptotics

no code implementations • 9 May 2021 • Wenhao Yang, Liangyu Zhang, Zhihua Zhang

In this paper, we study the non-asymptotic and asymptotic performance of the optimal robust policy and value function of robust Markov decision processes (MDPs), where the optimal robust policy and value function are solved only from a generative model.
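The robust MDP objective can be sketched with robust value iteration: each Bellman backup takes the worst case over an uncertainty set of transition models. The sketch below uses a small finite family of models for simplicity (real robust MDPs typically use balls around a nominal model); it is illustrative, not the paper's setup.

```python
import numpy as np

def robust_value_iteration(Ps, R, gamma=0.9, iters=500):
    """Robust value iteration over a finite uncertainty set (sketch).

    Ps: list of transition arrays, each of shape (A, S, S)
    R:  reward array of shape (S, A)
    """
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        # Q-values under each candidate model: (P @ V) has shape (A, S)
        Q_models = [R + gamma * (P @ V).T for P in Ps]    # each (S, A)
        Q_worst = np.min(np.stack(Q_models), axis=0)      # adversarial model
        V = Q_worst.max(axis=1)                           # greedy over actions
    return V
```

A useful sanity check: the robust optimal value can never exceed the optimal value computed under any single model from the set, since the adversary only removes options.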

Intervention Generative Adversarial Nets

no code implementations • 1 Jan 2021 • Jiadong Liang, Liangyu Zhang, Cheng Zhang, Zhihua Zhang

In this paper we propose a novel approach for stabilizing the training process of Generative Adversarial Networks as well as alleviating the mode collapse problem.

Intervention Generative Adversarial Networks

no code implementations • 9 Aug 2020 • Jiadong Liang, Liangyu Zhang, Cheng Zhang, Zhihua Zhang

In this paper we propose a novel approach for stabilizing the training process of Generative Adversarial Networks as well as alleviating the mode collapse problem.
