Search Results for author: Honghao Wei

Found 8 papers, 1 papers with code

FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

2 code implementations • 4 Oct 2020 • Honghao Wei, Lei Ying

In this paper, we propose a new type of Actor, named forward-looking Actor or FORK for short, for Actor-Critic algorithms.

reinforcement-learning Reinforcement Learning (RL)

598

Paper
Code

QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection

no code implementations • 4 Mar 2019 • Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying

The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping algorithm to detect misinformation.

Misinformation

Paper
Add Code

A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes

no code implementations • 3 Jun 2021 • Honghao Wei, Xin Liu, Lei Ying

This paper presents the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) with sublinear regret and zero constraint violation.

Paper
Add Code

Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems

no code implementations • 13 Dec 2022 • Xin Liu, Honghao Wei, Lei Ying

The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information.

Multi-agent Reinforcement Learning reinforcement-learning +1

Paper
Add Code

Provably Efficient Model-Free Algorithms for Non-stationary CMDPs

no code implementations • 10 Mar 2023 • Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, Xingyu Zhou

We study model-free reinforcement learning (RL) algorithms in episodic non-stationary constrained Markov Decision Processes (CMDPs), in which an agent aims to maximize the expected cumulative reward subject to a cumulative constraint on the expected utility (cost).

Reinforcement Learning (RL)

Paper
Add Code

Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs

no code implementations • 27 Sep 2023 • Zihan Zhou, Honghao Wei, Lei Ying

PRI achieves trio objectives: (i) PRI is a model-free algorithm; and (ii) it outputs an approximately optimal policy with a high probability at the end of learning; and (iii) PRI guarantees $\tilde{\mathcal{O}}(H\sqrt{K})$ regret and constraint violation, which significantly improves the best existing regret bound $\tilde{\mathcal{O}}(H^4 \sqrt{SA}K^{\frac{4}{5}})$ under a model-free algorithm, where $H$ is the length of each episode, $S$ is the number of states, $A$ is the number of actions, and the total number of episodes during learning is $2K+\tilde{\cal O}(K^{0. 25}).$ We further present a matching lower via an example that shows under any online learning algorithm, there exists a well-separated CMDP instance such that either the regret or violation has to be $\Omega(H\sqrt{K}),$ which matches the upper bound by a polylogarithmic factor.

Paper
Add Code

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

no code implementations • 22 Dec 2023 • Honghao Wei, Xin Liu, Lei Ying

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step.

4k reinforcement-learning +1

Paper
Add Code

Adversarially Trained Actor Critic for offline CMDPs

no code implementations • 1 Jan 2024 • Honghao Wei, Xiyue Peng, Xin Liu, Arnob Ghosh

Theoretically, we demonstrate that when the actor employs a no-regret optimization oracle, SATAC achieves two guarantees: (i) For the first time in the offline RL setting, we establish that SATAC can produce a policy that outperforms the behavior policy while maintaining the same level of safety, which is critical to designing an algorithm for offline RL.

Continuous Control Offline RL +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.