
no code implementations • 13 Oct 2023 • Zhongchang Sun, Shaofeng Zou

The data-driven setting where the disturbance signal parameters are unknown is further investigated, and an online and computationally efficient gradient ascent CuSum algorithm is designed.
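The gradient ascent CuSum algorithm builds on the classical CuSum recursion. As a point of reference, here is a minimal sketch of plain CuSum with a known log-likelihood ratio; the paper's data-driven setting instead estimates the unknown disturbance parameters online, and the Gaussian example below is purely illustrative:

```python
def cusum_detect(samples, loglik_ratio, threshold):
    """Classical CuSum: raise an alarm when the running statistic
    W_k = max(0, W_{k-1} + log-likelihood ratio) crosses the threshold."""
    w = 0.0
    for k, x in enumerate(samples):
        w = max(0.0, w + loglik_ratio(x))
        if w >= threshold:
            return k  # first alarm time (0-indexed)
    return None  # no alarm on this stream

# Toy example: pre-change N(0,1) vs post-change N(1,1); for unit-variance
# Gaussians the log-likelihood ratio reduces to x - 0.5.
llr = lambda x: x - 0.5
pre = [0.1, -0.2, 0.0, 0.3]        # negative drift keeps the statistic near 0
post = [1.2, 0.9, 1.5, 1.1, 1.3]   # positive drift accumulates after the change
alarm = cusum_detect(pre + post, llr, threshold=2.0)
```

The statistic clamps at zero before the change and accumulates afterward, so the alarm fires a few samples past the change point.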

1 code implementation • 30 Jul 2023 • Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao

Then we propose a robust multi-agent Q-learning (RMAQ) algorithm to find such an equilibrium, with convergence guarantees.

no code implementations • 22 May 2023 • Yue Wang, JinJun Xiong, Shaofeng Zou

We show that an improved sample complexity of $\mathcal{O}(SC^{\pi^*}\epsilon^{-2}(1-\gamma)^{-3})$ can be obtained, which matches the minimax lower bound for offline reinforcement learning and is thus minimax optimal.
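The bound can be read off numerically. In the illustrative snippet below, the function name and the suppressed constant (set to 1) are our own; only the scaling in $\epsilon$ and $\gamma$ is meaningful:

```python
def sample_complexity_bound(S, C_star, eps, gamma):
    # Order-level bound O(S * C^{pi*} * eps^{-2} * (1 - gamma)^{-3});
    # constants are suppressed, so this only illustrates scaling.
    return S * C_star * eps**-2 * (1.0 - gamma) ** -3

# Halving eps multiplies the bound by 4; pushing gamma from 0.9 to 0.99
# multiplies it by (0.1 / 0.01)^3 = 1000.
base = sample_complexity_bound(S=10, C_star=2.0, eps=0.1, gamma=0.9)
finer = sample_complexity_bound(S=10, C_star=2.0, eps=0.05, gamma=0.9)
```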

no code implementations • 17 May 2023 • Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs.
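A minimal sketch of the worst-case optimization, assuming the simplest possible uncertainty set (a finite list of candidate transition kernels; the paper treats richer sets):

```python
import numpy as np

def robust_bellman_backup(V, rewards, kernels, gamma):
    """One robust Bellman backup: for each (s, a), evaluate the next-state
    value under the worst kernel in a finite uncertainty set.
    kernels: list of arrays P[s, a, s'], each a candidate transition model."""
    S, A = rewards.shape
    Q = np.empty((S, A))
    for s in range(S):
        for a in range(A):
            worst = min(P[s, a] @ V for P in kernels)  # adversarial model choice
            Q[s, a] = rewards[s, a] + gamma * worst
    return Q.max(axis=1)  # greedy robust value

# Two-state toy problem: the nominal kernel P0 vs a perturbed kernel P1.
R = np.array([[1.0, 0.0], [0.0, 1.0]])
P0 = np.array([[[0.9, 0.1], [0.5, 0.5]], [[0.2, 0.8], [0.7, 0.3]]])
P1 = np.array([[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]])
V = np.zeros(2)
for _ in range(200):  # the backup is a gamma-contraction, so this converges
    V = robust_bellman_backup(V, R, [P0, P1], gamma=0.9)
```

Because the backup is a $\gamma$-contraction, iterating it converges to the robust value function, which is what robust value iteration exploits.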

no code implementations • 2 Jan 2023 • Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy.
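For context, plain (non-robust) relative value iteration, which the robust algorithm generalizes, can be sketched as follows; the toy MDP is our own construction with a known optimal gain of 1:

```python
import numpy as np

def relative_value_iteration(P, R, n_iters=200, ref=0):
    """Relative value iteration for average-reward MDPs: apply the Bellman
    operator and subtract the value at a reference state so the relative
    values stay bounded; the subtracted offset converges to the optimal gain."""
    S, A = R.shape
    h = np.zeros(S)
    gain = 0.0
    for _ in range(n_iters):
        Th = np.array([max(R[s, a] + P[s, a] @ h for a in range(A))
                       for s in range(S)])
        gain = Th[ref]   # offset at the reference state -> optimal average reward
        h = Th - gain    # relative value function
    return gain, h

# Toy MDP: action 1 earns reward 1 and moves uniformly at random;
# action 0 earns 0 and stays put. The optimal gain is therefore 1.
P = np.array([[[1.0, 0.0], [0.5, 0.5]],
              [[0.0, 1.0], [0.5, 0.5]]])
R = np.array([[0.0, 1.0], [0.0, 1.0]])
gain, h = relative_value_iteration(P, R)
```

The robust variant in the paper replaces the single kernel `P` with a worst-case choice over an uncertainty set inside the backup.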

no code implementations • 21 Oct 2022 • Qi Zhang, Zhongchang Sun, Luis C. Herrera, Shaofeng Zou

The worst-case average detection delay (WADD) is at most of the order of the logarithm of the average run length (ARL) to false alarm.
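In first-order terms from the classical quickest change detection literature (illustrative numbers, not the paper's constants): a CuSum threshold $b$ guarantees an ARL of at least $e^b$, while the worst-case delay grows like $b$ divided by the KL divergence between the post- and pre-change distributions, so delay is logarithmic in the ARL:

```python
import math

def delay_vs_arl(b, kl):
    # Classical first-order asymptotics for CuSum with threshold b:
    # guaranteed average run length to false alarm, and approximate
    # worst-case detection delay.
    arl_lower = math.exp(b)
    wadd_approx = b / kl
    return arl_lower, wadd_approx

arl, wadd = delay_vs_arl(b=8.0, kl=0.5)
# wadd / log(arl) = (b / kl) / b = 1 / kl, a constant:
# the delay grows only logarithmically as the ARL constraint tightens.
```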

no code implementations • 17 Sep 2022 • Sihong He, Yue Wang, Shuo Han, Shaofeng Zou, Fei Miao

In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with state transition kernel uncertainty for EV AMoD systems.

no code implementations • 14 Sep 2022 • Yue Wang, Fei Miao, Shaofeng Zou

We then investigate a concrete example of $\delta$-contamination uncertainty set, design an online and model-free algorithm and theoretically characterize its sample complexity.

no code implementations • 6 Sep 2022 • Yue Wang, Yi Zhou, Shaofeng Zou

Our techniques in this paper provide a general approach for finite-sample analysis of non-convex two timescale value-based reinforcement learning algorithms.

no code implementations • 4 Sep 2022 • Zhongchang Sun, Shaofeng Zou

The goal of the fusion center is to detect the anomaly with minimal detection delay subject to false alarm constraints.

no code implementations • 13 Jun 2022 • Tengyu Xu, Yue Wang, Shaofeng Zou, Yingbin Liang

The remarkable success of reinforcement learning (RL) heavily relies on observing the reward of every visited state-action pair.

no code implementations • 15 May 2022 • Yue Wang, Shaofeng Zou

We further develop a smoothed robust policy gradient method and show that to achieve an $\epsilon$-global optimum, the complexity is $\mathcal O(\epsilon^{-3})$.
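The paper's exact smoothing is not reproduced here; one standard way to smooth a non-differentiable max (for instance, over models in the uncertainty set) is the LogSumExp approximation, sketched below:

```python
import math

def smooth_max(xs, tau):
    """LogSumExp smoothing of max: tau * log(sum exp(x / tau)).
    As tau -> 0 this approaches max(xs); the gap is at most tau * log(n).
    Shifting by the max keeps the exponentials numerically stable."""
    m = max(xs)
    return m + tau * math.log(sum(math.exp((x - m) / tau) for x in xs))

xs = [1.0, 2.0, 3.5]
exact = max(xs)
for tau in (1.0, 0.1, 0.01):
    approx = smooth_max(xs, tau)
    # Sandwich bound: max <= smooth_max <= max + tau * log(n).
    assert exact <= approx <= exact + tau * math.log(len(xs))
```

The smoothed objective is differentiable, which is what makes a gradient method with a provable $\mathcal O(\epsilon^{-3})$-type complexity analyzable.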

no code implementations • 23 Mar 2022 • Zhongchang Sun, Shaofeng Zou

For the Bayesian setting, where the goal is to minimize the worst-case error probability, an optimal test is first obtained when the alphabet is finite.

no code implementations • 26 Feb 2022 • Zhongchang Sun, Shaofeng Zou, Ruizhi Zhang, Qunwei Li

The problem of quickest change detection (QCD) in anonymous heterogeneous sensor networks is studied.

no code implementations • 20 Oct 2021 • Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

Despite the challenge of the nonconcave objective subject to nonconcave constraints, the proposed approach is shown to converge to the global optimum with a complexity of $\tilde{\mathcal O}(1/\epsilon)$ in terms of the optimality gap and the constraint violation, which improves the complexity of the existing primal-dual approach by a factor of $\mathcal O(1/\epsilon)$ \citep{ding2020natural, paternain2019constrained}.

no code implementations • NeurIPS 2021 • Yue Wang, Shaofeng Zou

In this paper, we focus on model-free robust RL, where the uncertainty set is defined to be centered at a misspecified MDP that generates a single sample trajectory sequentially and is assumed to be unknown.

no code implementations • 8 Sep 2021 • Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou

Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy.

no code implementations • NeurIPS 2021 • Yue Wang, Shaofeng Zou, Yi Zhou

Temporal-difference learning with gradient correction (TDC) is a two time-scale algorithm for policy evaluation in reinforcement learning.
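The two coupled updates of TDC with linear function approximation, in the standard form due to Sutton et al.; the step sizes below are arbitrary illustrative values, with the correction weights on the fast timescale ($\beta > \alpha$):

```python
import numpy as np

def tdc_step(theta, w, phi, phi_next, r, gamma, alpha, beta):
    """One TDC update with linear features.
    theta: value parameters (slow timescale);
    w: gradient-correction weights (fast timescale)."""
    delta = r + gamma * phi_next @ theta - phi @ theta   # TD error
    theta = theta + alpha * (delta * phi - gamma * (phi @ w) * phi_next)
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# Single deterministic transition with one-hot features.
theta, w = np.zeros(2), np.zeros(2)
phi, phi_next = np.array([1.0, 0.0]), np.array([0.0, 1.0])
theta, w = tdc_step(theta, w, phi, phi_next, r=1.0, gamma=0.9,
                    alpha=0.1, beta=0.5)
```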

no code implementations • ICLR 2021 • Shaocong Ma, Ziyi Chen, Yi Zhou, Shaofeng Zou

Greedy-GQ is a value-based reinforcement learning (RL) algorithm for optimal control.

no code implementations • 7 Dec 2020 • Qunwei Li, Shaofeng Zou, Wenliang Zhong

Two types of GNNs are investigated, depending on whether labels are attached to nodes or graphs.

no code implementations • NeurIPS 2020 • Shaocong Ma, Yi Zhou, Shaofeng Zou

In the Markovian setting, our algorithm achieves the state-of-the-art sample complexity $O(\epsilon^{-1} \log {\epsilon}^{-1})$ that is near-optimal.

no code implementations • 20 May 2020 • Yue Wang, Shaofeng Zou

Greedy-GQ is an off-policy two timescale algorithm for optimal control in reinforcement learning.

1 code implementation • Conference 2019 • Shaofeng Zou, Mingzhu Long, Xuyang Wang, Xiang Xie, Guolin Li, Zhihua Wang

The number of iterations is reduced by about 36% by using transfer learning in our DIP process.

no code implementations • NeurIPS 2019 • Tengyu Xu, Shaofeng Zou, Yingbin Liang

Gradient-based temporal difference (GTD) algorithms are widely used in off-policy learning scenarios.

no code implementations • NeurIPS 2019 • Shaofeng Zou, Tengyu Xu, Yingbin Liang

For this fitted SARSA algorithm, we also provide its finite-sample analysis.

1 code implementation • 27 Jan 2019 • Yuheng Bu, Weihao Gao, Shaofeng Zou, Venugopal V. Veeravalli

We show that model compression can improve the population risk of a pre-trained model, by studying the tradeoff between the decrease in the generalization error and the increase in the empirical risk with model compression.
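The tradeoff is easy to see numerically; the figures below are invented purely for illustration:

```python
# Population risk decomposes as empirical risk + generalization gap.
# Compression can raise the empirical risk slightly while shrinking the
# generalization gap (a smaller model generalizes better), so the sum
# -- the population risk -- can decrease.
original = {"empirical_risk": 0.02, "generalization_gap": 0.10}
compressed = {"empirical_risk": 0.04, "generalization_gap": 0.05}

def population_risk(model):
    return model["empirical_risk"] + model["generalization_gap"]
```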

no code implementations • 15 Jan 2019 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli

The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error.
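Concretely, the individual-sample mutual information bound from this line of work takes roughly the form below for a $\sigma$-sub-Gaussian loss; the constants follow the standard statement and should be checked against the paper:

```latex
% Generalization error bound via per-sample mutual information,
% for a loss that is \sigma-sub-Gaussian:
\left| \mathbb{E}\!\left[ L_\mu(W) - L_S(W) \right] \right|
  \;\le\; \frac{1}{n} \sum_{i=1}^{n} \sqrt{\, 2\sigma^2 \, I(W; Z_i) \,}
```

where $L_\mu$ is the population risk, $L_S$ the empirical risk on the sample $S = (Z_1, \dots, Z_n)$, and $W$ the learned hypothesis; by Jensen's inequality this per-sample form is no looser than the full-sample bound based on $I(W; S)$.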

no code implementations • 21 Jan 2017 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli

A sequence is considered as outlying if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences.
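A divergence-based score in the spirit of this line of work, sketched below; the helper names and the leave-one-out pooling are our own illustrative choices, not necessarily the paper's exact test:

```python
import math
from collections import Counter

def empirical(seq, alphabet):
    """Empirical distribution of a sequence over a finite alphabet."""
    n = len(seq)
    counts = Counter(seq)
    return [counts[a] / n for a in alphabet]

def kl(p, q, eps=1e-12):
    """KL divergence with a small epsilon to avoid log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def score_outliers(sequences, alphabet):
    """Score each sequence by the KL divergence between its empirical
    distribution and the distribution pooled over the remaining sequences;
    an outlying sequence should receive the largest score."""
    scores = []
    for i in range(len(sequences)):
        rest = [x for j, s in enumerate(sequences) if j != i for x in s]
        scores.append(kl(empirical(sequences[i], alphabet),
                         empirical(rest, alphabet)))
    return scores

seqs = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0],
        [0, 0, 1, 1, 0, 1], [2, 2, 2, 2, 2, 1]]  # the last sequence is outlying
scores = score_outliers(seqs, alphabet=[0, 1, 2])
```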

no code implementations • 5 Apr 2016 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor

Sufficient conditions on minimum and maximum sizes of candidate anomalous intervals are characterized in order to guarantee the proposed test to be consistent.

no code implementations • 25 Apr 2014 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor, Xinghua Shi

samples drawn from a distribution p, whereas each anomalous sequence contains m i.i.d.

no code implementations • 1 Apr 2014 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor

If an anomalous interval does not exist, then all nodes receive samples generated by p. It is assumed that the distributions p and q are arbitrary and unknown.

Papers With Code is a free resource with all data licensed under CC-BY-SA.