Search Results for author: Shaofeng Zou

Found 41 papers, 4 papers with code

Independently-Normalized SGD for Generalized-Smooth Nonconvex Optimization

no code implementations • 17 Oct 2024 • Yufeng Yang, Erin Tripp, Yifan Sun, Shaofeng Zou, Yi Zhou

Recent studies have shown that many nonconvex machine learning problems meet a so-called generalized-smooth condition that extends beyond traditional smooth nonconvex optimization.
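
The snippet does not spell the condition out; a standard form in this literature is $(L_0, L_1)$-smoothness, $\|\nabla f(x) - \nabla f(y)\| \le (L_0 + L_1 \|\nabla f(x)\|) \|x - y\|$, under which gradient normalization is the usual algorithmic remedy. A minimal sketch of a normalized SGD-style step under that assumption; it illustrates the normalization idea only, not the paper's independently-normalized scheme:

    import numpy as np

    def normalized_sgd_step(x, stoch_grad, lr=0.1, eps=1e-8):
        """One normalized SGD step: dividing by the gradient norm keeps the
        effective step bounded even when the local smoothness constant grows
        with the gradient, as under (L0, L1)-smoothness."""
        g = stoch_grad(x)                         # stochastic gradient estimate
        return x - lr * g / (np.linalg.norm(g) + eps)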

Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

no code implementations • 24 Jun 2024 • Yudan Wang, Shaofeng Zou, Yue Wang

We develop algorithms for uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases.

Reinforcement Learning
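
For intuition, the worst-case expected value over a KL uncertainty set admits a well-known scalar dual, $\inf_{q: D_{\mathrm{KL}}(q\|p)\le\rho} \mathbb{E}_q[V] = \sup_{\lambda>0}\{-\lambda\log\mathbb{E}_p[e^{-V/\lambda}] - \lambda\rho\}$, which reduces the robust Bellman backup to a one-dimensional search. A minimal sketch of this dual, not the paper's algorithm; the grid over $\lambda$ is an illustrative choice:

    import numpy as np
    from scipy.special import logsumexp

    def kl_robust_value(p, v, rho, lambdas=np.logspace(-3, 3, 200)):
        """Worst-case E_q[v] over {q : KL(q||p) <= rho}, via the scalar dual
        sup_{lam > 0} -lam * log E_p[exp(-v/lam)] - lam * rho."""
        return max(-lam * logsumexp(-v / lam, b=p) - lam * rho
                   for lam in lambdas)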

Non-Asymptotic Analysis for Single-Loop (Natural) Actor-Critic with Compatible Function Approximation

no code implementations • 3 Jun 2024 • Yudan Wang, Yue Wang, Yi Zhou, Shaofeng Zou

Specifically, existing studies show that AC converges to an $\epsilon+\varepsilon_{\text{critic}}$ neighborhood of stationary points with the best known sample complexity of $\mathcal{O}(\epsilon^{-2})$ (up to a log factor), and NAC converges to an $\epsilon+\varepsilon_{\text{critic}}+\sqrt{\varepsilon_{\text{actor}}}$ neighborhood of the global optimum with the best known sample complexity of $\mathcal{O}(\epsilon^{-3})$, where $\varepsilon_{\text{critic}}$ is the approximation error of the critic and $\varepsilon_{\text{actor}}$ is the approximation error induced by the insufficient expressive power of the parameterized policy class.

MGDA Converges under Generalized Smoothness, Provably

no code implementations • 29 May 2024 • Qi Zhang, Peiyao Xiao, Shaofeng Zou, Kaiyi Ji

We provide a comprehensive convergence analysis of these algorithms and show that they converge to an $\epsilon$-accurate Pareto stationary point with a guaranteed $\epsilon$-level average CA distance (i.e., the gap between the updating direction and the CA direction) over all iterations, where $\mathcal{O}(\epsilon^{-2})$ and $\mathcal{O}(\epsilon^{-4})$ samples in total are needed for the deterministic and stochastic settings, respectively.

Multi-Task Learning
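
For two tasks, the conflict-avoidant (CA) direction targeted by MGDA-style methods has a closed form: it is the minimum-norm point in the convex hull of the task gradients. A minimal deterministic sketch; the paper's stochastic variants add gradient estimation on top of this:

    import numpy as np

    def two_task_ca_direction(g1, g2):
        """Min-norm convex combination of two task gradients:
        argmin over gamma in [0, 1] of ||gamma*g1 + (1-gamma)*g2||^2."""
        diff = g1 - g2
        denom = np.dot(diff, diff)
        if denom == 0.0:                          # gradients coincide
            return g1.copy()
        gamma = np.clip(np.dot(g2 - g1, g2) / denom, 0.0, 1.0)
        return gamma * g1 + (1.0 - gamma) * g2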

Finite-Time Analysis for Conflict-Avoidant Multi-Task Reinforcement Learning

no code implementations • 25 May 2024 • Yudan Wang, Peiyao Xiao, Hao Ban, Kaiyi Ji, Shaofeng Zou

However, these methods often suffer from the issue of gradient conflict, where tasks with larger gradients dominate the update direction, resulting in performance degradation on the other tasks.

Reinforcement Learning

Constrained Reinforcement Learning Under Model Mismatch

no code implementations • 2 May 2024 • Zhongchang Sun, Sihong He, Fei Miao, Shaofeng Zou

Existing studies on constrained reinforcement learning (RL) can obtain a policy that performs well in the training environment; however, such a policy may fail when the deployment environment deviates from the training one.

Reinforcement Learning

Convergence Guarantees for RMSProp and Adam in Generalized-smooth Non-convex Optimization with Affine Noise Variance

no code implementations • 1 Apr 2024 • Qi Zhang, Yi Zhou, Shaofeng Zou

Specifically, to address the challenges arising from the dependence among the adaptive update, the unbounded gradient estimate, and the Lipschitz constant, we demonstrate that the first-order term in the descent lemma converges, and that its denominator is upper bounded by a function of the gradient norm.
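
For reference, affine noise variance is typically formalized as $\mathbb{E}\|g_t - \nabla f(x_t)\|^2 \le \sigma_0^2 + \sigma_1^2 \|\nabla f(x_t)\|^2$ for the stochastic gradient $g_t$, so the noise need not be uniformly bounded and can grow with the gradient norm; this is what entangles the adaptive denominator with the gradient in the analysis.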


Large-Scale Non-convex Stochastic Constrained Distributionally Robust Optimization

no code implementations • 1 Apr 2024 • Qi Zhang, Yi Zhou, Ashley Prater-Bennette, Lixin Shen, Shaofeng Zou

We prove that our algorithm finds an $\epsilon$-stationary point with a computational complexity of $\mathcal O(\epsilon^{-3k_*-5})$, where $k_*$ is the parameter of the Cressie-Read divergence.
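
For context, the Cressie-Read family of divergences is commonly generated by $f_k(t) = \frac{t^k - kt + k - 1}{k(k-1)}$, with $k = 2$ recovering the $\chi^2$ divergence; as the formula above shows, the computational complexity deteriorates as $k_*$ grows.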

Sample Complexity Characterization for Linear Contextual MDPs

no code implementations • 5 Feb 2024 • Junze Deng, Yuan Cheng, Shaofeng Zou, Yingbin Liang

Our result for the second model is the first known result for this type of function approximation model.

Quickest Change Detection in Autoregressive Models

no code implementations • 13 Oct 2023 • Zhongchang Sun, Shaofeng Zou

The data-driven setting where the disturbance signal parameters are unknown is further investigated, and an online and computationally efficient gradient ascent CuSum algorithm is designed.

Change Detection
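
For background, the classical CuSum statistic underlying this work is the recursion $W_t = \max(W_{t-1} + \log\frac{g(X_t)}{f(X_t)}, 0)$, with an alarm raised when $W_t$ crosses a threshold; the gradient ascent variant is needed because the exact post-change likelihood is unavailable when the disturbance parameters are unknown. A minimal sketch of the classical recursion; log_lik_pre and log_lik_post are hypothetical callables:

    def cusum(samples, log_lik_pre, log_lik_post, threshold):
        """Classical CuSum: accumulate log-likelihood ratios, clip at zero,
        and stop the first time the statistic exceeds the threshold."""
        w = 0.0
        for t, x in enumerate(samples):
            w = max(w + log_lik_post(x) - log_lik_pre(x), 0.0)
            if w >= threshold:
                return t                 # declare a change at time t
        return None                      # no change detected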

Robust Multi-Agent Reinforcement Learning with State Uncertainty

1 code implementation • 30 Jul 2023 • Sihong He, Songyang Han, Sanbao Su, Shuo Han, Shaofeng Zou, Fei Miao

Then we propose a robust multi-agent Q-learning (RMAQ) algorithm to find such an equilibrium, with convergence guarantees.

Multi-agent Reinforcement Learning, Q-Learning

Achieving the Asymptotically Optimal Sample Complexity of Offline Reinforcement Learning: A DRO-Based Approach

no code implementations • 22 May 2023 • Yue Wang, JinJun Xiong, Shaofeng Zou

We show that an improved sample complexity of $\mathcal{O}(SC^{\pi^*}\epsilon^{-2}(1-\gamma)^{-3})$ can be obtained, which asymptotically matches with the minimax lower bound for offline reinforcement learning, and thus is asymptotically minimax optimal.

Reinforcement Learning

Model-Free Robust Average-Reward Reinforcement Learning

no code implementations • 17 May 2023 • Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

Robust Markov decision processes (MDPs) address the challenge of model uncertainty by optimizing the worst-case performance over an uncertainty set of MDPs.

Q-Learning, Reinforcement Learning

Robust Average-Reward Markov Decision Processes

no code implementations • 2 Jan 2023 • Yue Wang, Alvaro Velasquez, George Atia, Ashley Prater-Bennette, Shaofeng Zou

We derive the robust Bellman equation for robust average-reward MDPs, prove that the optimal policy can be derived from its solution, and further design a robust relative value iteration algorithm that provably finds its solution, or equivalently, the optimal robust policy.
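
For intuition, non-robust relative value iteration for average-reward MDPs iterates the Bellman operator and subtracts the value at a reference state to keep the iterates bounded; the robust version replaces the expectation over next states with a worst case over the uncertainty set. A minimal non-robust sketch with illustrative names:

    import numpy as np

    def relative_value_iteration(P, r, iters=1000, ref=0):
        """P: (S, A, S) transition tensor, r: (S, A) reward matrix.
        A robust variant would replace the expectation P @ h below with
        a worst-case expectation over the uncertainty set."""
        h = np.zeros(P.shape[0])
        for _ in range(iters):
            th = (r + P @ h).max(axis=1)   # Bellman backup over actions
            h = th - th[ref]               # offset by the reference state
        return h                           # at convergence, th[ref] is the optimal gain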

What is the Solution for State-Adversarial Multi-Agent Reinforcement Learning?

1 code implementation • 6 Dec 2022 • Songyang Han, Sanbao Su, Sihong He, Shuo Han, Haizhao Yang, Shaofeng Zou, Fei Miao

Various methods for Multi-Agent Reinforcement Learning (MARL) have been developed with the assumption that agents' policies are based on accurate state information.

Deep Reinforcement Learning, Multi-agent Reinforcement Learning

A Robust and Constrained Multi-Agent Reinforcement Learning Electric Vehicle Rebalancing Method in AMoD Systems

no code implementations • 17 Sep 2022 • Sihong He, Yue Wang, Shuo Han, Shaofeng Zou, Fei Miao

In this work, we design a robust and constrained multi-agent reinforcement learning (MARL) framework with state transition kernel uncertainty for EV AMoD systems.

Fairness, Multi-agent Reinforcement Learning

Robust Constrained Reinforcement Learning

no code implementations • 14 Sep 2022 • Yue Wang, Fei Miao, Shaofeng Zou

We then investigate a concrete example of $\delta$-contamination uncertainty set, design an online and model-free algorithm and theoretically characterize its sample complexity.

Adversarial Attack, Reinforcement Learning
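
For the $\delta$-contamination set $\{(1-\delta)p + \delta q : q \text{ arbitrary}\}$, the worst-case expected value has a simple closed form, which is part of what makes this uncertainty set amenable to online, model-free algorithms. A one-line sketch:

    import numpy as np

    def contamination_worst_case(p, v, delta):
        """Worst case of the expected value of v when an adversary moves
        delta probability mass from p onto the worst-value state."""
        return (1.0 - delta) * np.dot(p, v) + delta * np.min(v)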

Finite-Time Error Bounds for Greedy-GQ

no code implementations • 6 Sep 2022 • Yue Wang, Yi Zhou, Shaofeng Zou

Our finite-time error bounds match those of stochastic gradient descent algorithms for general smooth non-convex optimization problems, despite the additional challenge posed by the two time-scale updates.

Reinforcement Learning

Quickest Anomaly Detection in Sensor Networks With Unlabeled Samples

no code implementations • 4 Sep 2022 • Zhongchang Sun, Shaofeng Zou

The goal of the fusion center is to detect the anomaly with minimal detection delay subject to false alarm constraints.

Anomaly Detection

Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward

no code implementations • 13 Jun 2022 • Tengyu Xu, Yue Wang, Shaofeng Zou, Yingbin Liang

The remarkable success of reinforcement learning (RL) heavily relies on observing the reward of every visited state-action pair.

Offline RL, Reinforcement Learning

Policy Gradient Method For Robust Reinforcement Learning

no code implementations • 15 May 2022 • Yue Wang, Shaofeng Zou

We further develop a smoothed robust policy gradient method and show that to achieve an $\epsilon$-global optimum, the complexity is $\mathcal O(\epsilon^{-3})$.

Reinforcement Learning

Kernel Robust Hypothesis Testing

no code implementations • 23 Mar 2022 • Zhongchang Sun, Shaofeng Zou

For the Bayesian setting, where the goal is to minimize the worst-case error probability, an optimal test is first obtained when the alphabet is finite.

Quickest Change Detection in Anonymous Heterogeneous Sensor Networks

no code implementations • 26 Feb 2022 • Zhongchang Sun, Shaofeng Zou, Ruizhi Zhang, Qunwei Li

The problem of quickest change detection (QCD) in anonymous heterogeneous sensor networks is studied.

Change Detection

Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process

no code implementations • 20 Oct 2021 • Tianjiao Li, Ziwei Guan, Shaofeng Zou, Tengyu Xu, Yingbin Liang, Guanghui Lan

Despite the challenge of the nonconcave objective subject to nonconcave constraints, the proposed approach is shown to converge to the global optimum with a complexity of $\tilde{\mathcal O}(1/\epsilon)$ in terms of the optimality gap and the constraint violation, which improves the complexity of the existing primal-dual approach by a factor of $\mathcal O(1/\epsilon)$ \citep{ding2020natural, paternain2019constrained}.

Online Robust Reinforcement Learning with Model Uncertainty

no code implementations • NeurIPS 2021 • Yue Wang, Shaofeng Zou

In this paper, we focus on model-free robust RL, where the uncertainty set is defined to be centered at a misspecified MDP that generates a single sample trajectory sequentially and is assumed to be unknown.

Q-Learning, Reinforcement Learning

Sample and Communication-Efficient Decentralized Actor-Critic Algorithms with Finite-Time Analysis

no code implementations • 8 Sep 2021 • Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou

Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy.

Non-Asymptotic Analysis for Two Time-scale TDC with General Smooth Function Approximation

no code implementations • NeurIPS 2021 • Yue Wang, Shaofeng Zou, Yi Zhou

Temporal-difference learning with gradient correction (TDC) is a two time-scale algorithm for policy evaluation in reinforcement learning.

Reinforcement Learning
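
For reference, with linear function approximation TDC maintains a correction weight on the faster time scale alongside the main value parameter; the paper extends the analysis to general smooth (nonlinear) approximation. A minimal linear-TDC sketch with illustrative names:

    import numpy as np

    def tdc_step(theta, w, phi, phi_next, reward, gamma, alpha, beta):
        """One linear TDC update. theta: value weights (slow time scale),
        w: correction weights (fast time scale), phi/phi_next: features."""
        delta = reward + gamma * np.dot(theta, phi_next) - np.dot(theta, phi)
        theta = theta + alpha * (delta * phi - gamma * phi_next * np.dot(phi, w))
        w = w + beta * (delta - np.dot(phi, w)) * phi
        return theta, w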

Learning Graph Neural Networks with Approximate Gradient Descent

no code implementations • 7 Dec 2020 • Qunwei Li, Shaofeng Zou, Wenliang Zhong

Two types of GNNs are investigated, depending on whether labels are attached to nodes or graphs.

Variance-Reduced Off-Policy TDC Learning: Non-Asymptotic Convergence Analysis

no code implementations • NeurIPS 2020 • Shaocong Ma, Yi Zhou, Shaofeng Zou

In the Markovian setting, our algorithm achieves the state-of-the-art sample complexity $O(\epsilon^{-1} \log {\epsilon}^{-1})$ that is near-optimal.

Two Time-scale Off-Policy TD Learning: Non-asymptotic Analysis over Markovian Samples

no code implementations • NeurIPS 2019 • Tengyu Xu, Shaofeng Zou, Yingbin Liang

Gradient-based temporal difference (GTD) algorithms are widely used in off-policy learning scenarios.

Information-Theoretic Understanding of Population Risk Improvement with Model Compression

1 code implementation • 27 Jan 2019 • Yuheng Bu, Weihao Gao, Shaofeng Zou, Venugopal V. Veeravalli

We show that model compression can improve the population risk of a pre-trained model, by studying the tradeoff between the decrease in the generalization error and the increase in the empirical risk with model compression.

Clustering, Model Compression

Tightening Mutual Information Based Bounds on Generalization Error

no code implementations • 15 Jan 2019 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli

The bound is derived under more general conditions on the loss function than in existing studies; nevertheless, it provides a tighter characterization of the generalization error.
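
For context, the classical mutual-information bound controls the expected generalization error of an algorithm with output $W$ trained on $n$ samples $S = (Z_1, \dots, Z_n)$ by $\sqrt{2\sigma^2 I(W; S)/n}$ when the loss is $\sigma$-sub-Gaussian; the tightened bound here is based on the mutual information between the output and individual samples, of the form $\frac{1}{n}\sum_{i=1}^{n}\sqrt{2\sigma^2 I(W; Z_i)}$.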

Linear-Complexity Exponentially-Consistent Tests for Universal Outlying Sequence Detection

no code implementations • 21 Jan 2017 • Yuheng Bu, Shaofeng Zou, Venugopal V. Veeravalli

A sequence is considered as outlying if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences.

Clustering

Nonparametric Detection of Geometric Structures over Networks

no code implementations • 5 Apr 2016 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor

Sufficient conditions on the minimum and maximum sizes of candidate anomalous intervals are characterized in order to guarantee that the proposed test is consistent.

Nonparametric Detection of Anomalous Data Streams

no code implementations • 25 Apr 2014 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor, Xinghua Shi

Each typical sequence contains n i.i.d. samples drawn from a distribution p, whereas each anomalous sequence contains m i.i.d. samples drawn from a different distribution q.

Two-sample testing
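
The tests in this line of work are built on kernel two-sample statistics, in particular the maximum mean discrepancy (MMD) between empirical distributions. A minimal sketch of the biased empirical squared MMD with a Gaussian kernel; the bandwidth is an illustrative choice:

    import numpy as np

    def mmd2_biased(x, y, sigma=1.0):
        """Biased empirical MMD^2 between 1-D samples x and y under a
        Gaussian kernel k(a, b) = exp(-(a - b)^2 / (2 * sigma^2))."""
        k = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2
                                / (2.0 * sigma ** 2))
        return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()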

A Kernel-Based Nonparametric Test for Anomaly Detection over Line Networks

no code implementations • 1 Apr 2014 • Shaofeng Zou, Yingbin Liang, H. Vincent Poor

If an anomalous interval does not exist, then all nodes receive samples generated by p. It is assumed that the distributions p and q are arbitrary and unknown.

Anomaly Detection
