Search Results for author: Lei Ying

Found 24 papers, 4 papers with code

FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

2 code implementations · 4 Oct 2020 · Honghao Wei, Lei Ying

In this paper, we propose a new type of Actor, named forward-looking Actor or FORK for short, for Actor-Critic algorithms.

reinforcement-learning · Reinforcement Learning (RL)

Reconstructing Graph Diffusion History from a Single Snapshot

1 code implementation · 1 Jun 2023 · Ruizhong Qiu, Dingsu Wang, Lei Ying, H. Vincent Poor, Yifang Zhang, Hanghang Tong

They are exclusively based on the maximum likelihood estimation (MLE) formulation and require the true diffusion parameters to be known.

The Mean-Squared Error of Double Q-Learning

1 code implementation · NeurIPS 2020 · Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning.

Q-Learning
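The rule being compared can be sketched in a few lines. The tabular update below follows the standard Double Q-learning scheme; the toy MDP sizes, step size, and random transitions are illustrative assumptions, not the paper's analysis setting:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, alpha, gamma = 3, 2, 0.1, 0.9

# Two tables: Double Q-learning selects the next action with one table and
# evaluates it with the other, tempering the max-operator overestimation
# that the mean-squared-error comparison is about.
QA = np.zeros((n_states, n_actions))
QB = np.zeros((n_states, n_actions))

def double_q_update(s, a, r, s_next):
    if rng.random() < 0.5:
        a_star = QA[s_next].argmax()                                     # select with QA
        QA[s, a] += alpha * (r + gamma * QB[s_next, a_star] - QA[s, a])  # evaluate with QB
    else:
        b_star = QB[s_next].argmax()
        QB[s, a] += alpha * (r + gamma * QA[s_next, b_star] - QB[s, a])

# Drive the update with random transitions from a toy MDP.
for _ in range(1000):
    s, a = int(rng.integers(n_states)), int(rng.integers(n_actions))
    double_q_update(s, a, float(rng.normal()), int(rng.integers(n_states)))
```

Standard Q-learning would instead use a single table and evaluate its own argmax, which is the source of the bias the two estimators trade off differently.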

Collaborative Filtering with Information-Rich and Information-Sparse Entities

no code implementations · 6 Mar 2014 · Kai Zhu, Rui Wu, Lei Ying, R. Srikant

In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered. Further, we assume that some users rate many items (information-rich users) while others rate only a few (information-sparse users).

Clustering · Collaborative Filtering +1

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

no code implementations · 1 Oct 2013 · Jiaming Xu, Rui Wu, Kai Zhu, Bruce Hajek, R. Srikant, Lei Ying

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure.

Clustering
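The row/column cluster structure described above is easy to instantiate. A toy co-clustered binary matrix, where each entry's 1-probability depends only on its (row cluster, column cluster) pair; the cluster counts and block probabilities are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(6)

# Assign each of 20 rows and 15 columns to one of 2 clusters.
row_c = rng.integers(2, size=20)
col_c = rng.integers(2, size=15)

# Block-wise probabilities of a 1: entry (i, j) depends only on the
# pair (row cluster of i, column cluster of j).
p = np.array([[0.9, 0.1],
              [0.2, 0.8]])
M = (rng.random((20, 15)) < p[np.ix_(row_c, col_c)]).astype(int)
```

Recovering `row_c` and `col_c` from a noisy `M` is exactly the joint row/column clustering problem the paper analyzes.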

Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

no code implementations · 3 Feb 2019 · R. Srikant, Lei Ying

We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE).
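A canonical instance of such an algorithm is TD(0) with linear function approximation, where the iterate is updated along a sample path of a Markov chain. The chain, features, and step size below are toy assumptions for illustration, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(1)
d, alpha, gamma = 4, 0.01, 0.9

# The state sequence follows a 2-state Markov chain, so the noise driving
# the iterate is Markovian rather than i.i.d.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])          # transition matrix
phi = rng.normal(size=(2, d))       # fixed feature vector per state
reward = np.array([1.0, -1.0])

theta = np.zeros(d)
s = 0
for _ in range(5000):
    s_next = rng.choice(2, p=P[s])
    # Linear-SA step: theta moves toward the equilibrium of the
    # associated ODE theta_dot = A theta + b.
    td_err = reward[s] + gamma * phi[s_next] @ theta - phi[s] @ theta
    theta += alpha * td_err * phi[s]
    s = s_next
```

The finite-time bounds in the paper control the moments of the deviation of iterates like `theta` from that ODE equilibrium.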

QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection

no code implementations · 4 Mar 2019 · Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying

The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping algorithm to detect misinformation.

Misinformation
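The online stage has the flavor of a sequential likelihood-ratio stopping rule: accumulate evidence from each observed spreading event and stop once it crosses a threshold. The observation models `p_mis`, `p_leg`, and the threshold below are illustrative assumptions, not QuickStop's learned spreading model:

```python
import math

import numpy as np

rng = np.random.default_rng(3)

# Probability that an observed spreading event occurs under the
# "misinformation" model vs. the "legitimate" model (assumed numbers).
p_mis, p_leg, threshold = 0.7, 0.4, 4.0

def stop_time(observations):
    """Return the first step at which the log-likelihood ratio in favor
    of misinformation crosses the threshold, or None if it never does."""
    llr = 0.0
    for k, x in enumerate(observations, start=1):
        llr += math.log(p_mis if x else 1 - p_mis)
        llr -= math.log(p_leg if x else 1 - p_leg)
        if llr >= threshold:
            return k          # stop and flag as misinformation
    return None

# A stream generated under the misinformation model should trigger a stop.
obs = (rng.random(200) < p_mis).astype(int)
t = stop_time(obs)
```

In QuickStop the probabilistic model itself is learned offline, and the stopping rule is derived as the solution of a Markov optimal stopping problem rather than a fixed threshold.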

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

1 code implementation · NeurIPS 2019 · Harsh Gupta, R. Srikant, Lei Ying

We study two time-scale linear stochastic approximation algorithms, which can be used to model well-known reinforcement learning algorithms such as GTD, GTD2, and TDC.

reinforcement-learning · Reinforcement Learning (RL)
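The two time scales can be made concrete with TDC, one of the algorithms named above: a fast iterate `w` (larger step size `beta`) tracks a projection of the TD error while a slow iterate `theta` (smaller step size `alpha`) follows the corrected gradient. The chain, features, and step sizes are toy choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(5)
d, alpha, beta, gamma = 4, 0.005, 0.05, 0.9   # beta >> alpha: two time scales

P = np.array([[0.7, 0.3],
              [0.4, 0.6]])        # toy 2-state Markov chain
phi = rng.normal(size=(2, d))     # fixed features per state
reward = np.array([0.5, -0.5])

theta, w = np.zeros(d), np.zeros(d)
s = 0
for _ in range(5000):
    s2 = rng.choice(2, p=P[s])
    delta = reward[s] + gamma * phi[s2] @ theta - phi[s] @ theta
    # Slow iterate: TD step plus a correction term using the fast iterate.
    theta += alpha * (delta * phi[s] - gamma * (phi[s] @ w) * phi[s2])
    # Fast iterate: tracks the projected TD error.
    w += beta * (delta - phi[s] @ w) * phi[s]
    s = s2
```

The paper's finite-time bounds quantify how the ratio of the two step sizes should be chosen, which is what motivates its adaptive learning-rate selection.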

POND: Pessimistic-Optimistic oNline Dispatching

no code implementations · 20 Oct 2020 · Xin Liu, Bin Li, Pengyi Shi, Lei Ying

This paper considers constrained online dispatching with unknown arrival, reward and constraint distributions.

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

no code implementations · NeurIPS 2021 · Xin Liu, Bin Li, Pengyi Shi, Lei Ying

Thus, the overall computational complexity of our algorithm is similar to that of the linear UCB for unconstrained stochastic linear bandits.

A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes

no code implementations · 3 Jun 2021 · Honghao Wei, Xin Liu, Lei Ying

This paper presents the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) with sublinear regret and zero constraint violation.

Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning

no code implementations · 13 Nov 2021 · Jueming Hu, Xuxi Yang, Weichang Wang, Peng Wei, Lei Ying, Yongming Liu

Obstacle avoidance for small unmanned aircraft is vital for the safety of future urban air mobility (UAM) and Unmanned Aircraft System (UAS) Traffic Management (UTM).

Continuous Control · Management +2

Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment

no code implementations · 26 May 2022 · Zixian Yang, Xin Liu, Lei Ying

To understand the exploration, exploitation, and engagement in these systems, we propose a new model, called MAB-A where "A" stands for abandonment and the abandonment probability depends on the current recommended item and the user's past experience (called state).

Multi-Armed Bandits · Q-Learning +1
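The MAB-A model described above is straightforward to simulate: after each recommendation the user abandons with a probability that depends on the recommended arm and a state summarizing past experience. The state (last reward), reward means, and abandonment probabilities below are illustrative assumptions, not the paper's parameterization:

```python
import numpy as np

rng = np.random.default_rng(4)

reward_mean = np.array([0.2, 0.5, 0.8])          # Bernoulli reward per arm
# Abandonment probability depends on (state, recommended arm); here the
# state is simply whether the last reward was high (1) or low (0).
abandon_prob = {0: np.array([0.05, 0.10, 0.20]),
                1: np.array([0.01, 0.02, 0.05])}

def episode(policy):
    """Run one user session; return total reward collected before abandonment."""
    state, total = 0, 0.0
    while True:
        arm = policy(state)
        r = float(rng.random() < reward_mean[arm])
        total += r
        state = int(r)                            # "past experience" state
        if rng.random() < abandon_prob[state][arm]:
            return total

# Average session reward of a policy that always recommends arm 2.
avg = float(np.mean([episode(lambda s: 2) for _ in range(500)]))
```

The exploration/exploitation tension here is that a high-mean arm may also carry a high abandonment risk, so myopically greedy play can end sessions early.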

Will Bilevel Optimizers Benefit from Loops

no code implementations · 27 May 2022 · Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying

Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.

Bilevel Optimization · Computational Efficiency

Learning While Scheduling in Multi-Server Systems with Unknown Statistics: MaxWeight with Discounted UCB

no code implementations · 2 Sep 2022 · Zixian Yang, R. Srikant, Lei Ying

We prove that under our algorithm the asymptotic average queue length is bounded by one divided by the traffic slackness, which is order-wise optimal.

Scheduling
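The algorithm's two ingredients, discounted UCB estimates of the unknown service rates and a MaxWeight rule that weights queue lengths by those estimates, can be sketched as follows. The arrival rate, discount factor, bonus form, and per-server greedy matching are illustrative simplifications, not the paper's exact algorithm:

```python
import math

import numpy as np

rng = np.random.default_rng(2)
n_queues, n_servers, gamma = 3, 2, 0.99        # gamma < 1 forgets stale samples

# Discounted per-(queue, server) statistics for the UCB estimates.
count = np.full((n_queues, n_servers), 1e-6)   # discounted number of services attempted
total = np.zeros((n_queues, n_servers))        # discounted successful services
queues = np.zeros(n_queues)
true_mu = rng.uniform(0.3, 0.9, size=(n_queues, n_servers))  # unknown service rates

for t in range(1, 2001):
    queues += rng.random(n_queues) < 0.4       # Bernoulli arrivals to each queue
    count *= gamma
    total *= gamma                             # geometric discounting of old data
    bonus = np.sqrt(2 * math.log(max(count.sum(), math.e)) / count)
    ucb = total / count + bonus                # optimistic service-rate estimate
    for server in range(n_servers):
        # MaxWeight with learned weights: serve the queue maximizing
        # queue length x UCB service-rate estimate.
        q = int(np.argmax(queues * ucb[:, server]))
        served = rng.random() < true_mu[q, server]
        count[q, server] += 1
        total[q, server] += float(served)
        queues[q] = max(queues[q] - float(served), 0.0)
```

The discounting is what lets the estimates track the statistics without the sample counts growing stale, which is central to the queue-length bound quoted above.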

Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems

no code implementations · 13 Dec 2022 · Xin Liu, Honghao Wei, Lei Ying

The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information.

Multi-agent Reinforcement Learning · reinforcement-learning +1

Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach

no code implementations · 4 Jan 2023 · Kaiyi Ji, Lei Ying

In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility.

Bilevel Optimization

On the Global Convergence of Risk-Averse Policy Gradient Methods with Expected Conditional Risk Measures

no code implementations · 26 Jan 2023 · Xian Yu, Lei Ying

Risk-sensitive reinforcement learning (RL) has become a popular tool to control the risk of uncertain outcomes and ensure reliable performance in various sequential decision-making problems.

Decision Making · Policy Gradient Methods +1

Online Nonstochastic Control with Adversarial and Static Constraints

no code implementations · 5 Feb 2023 · Xin Liu, Zixian Yang, Lei Ying

This subroutine also achieves the state-of-the-art regret and constraint violation bounds for constrained online convex optimization problems, which is of independent interest.

Provably Efficient Model-Free Algorithms for Non-stationary CMDPs

no code implementations · 10 Mar 2023 · Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, Xingyu Zhou

We study model-free reinforcement learning (RL) algorithms in episodic non-stationary constrained Markov Decision Processes (CMDPs), in which an agent aims to maximize the expected cumulative reward subject to a cumulative constraint on the expected utility (cost).

Reinforcement Learning (RL)

Model-Free, Regret-Optimal Best Policy Identification in Online CMDPs

no code implementations · 27 Sep 2023 · Zihan Zhou, Honghao Wei, Lei Ying

This paper considers the best policy identification (BPI) problem in online Constrained Markov Decision Processes (CMDPs).

Safe Reinforcement Learning with Instantaneous Constraints: The Role of Aggressive Exploration

no code implementations · 22 Dec 2023 · Honghao Wei, Xin Liu, Lei Ying

This paper studies safe Reinforcement Learning (safe RL) with linear function approximation and under hard instantaneous constraints where unsafe actions must be avoided at each step.

reinforcement-learning · Safe Reinforcement Learning

Cost Aware Best Arm Identification

no code implementations · 26 Feb 2024 · Kellen Kanarios, Qining Zhang, Lei Ying

In this paper, we study a best arm identification problem with dual objectives.

Learning-Based Pricing and Matching for Two-Sided Queues

no code implementations · 17 Mar 2024 · Zixian Yang, Lei Ying

We prove that our proposed algorithm yields a sublinear regret $\tilde{O}(T^{5/6})$ and queue-length bound $\tilde{O}(T^{2/3})$, where $T$ is the time horizon.
