Search Results for author: Lei Ying

Found 17 papers, 3 papers with code

On the Global Convergence of Risk-Averse Policy Gradient Methods with Dynamic Time-Consistent Risk Measures

no code implementations • 26 Jan 2023 • Xian Yu, Lei Ying

Risk-sensitive reinforcement learning (RL) has become a popular tool to control the risk of uncertain outcomes and ensure reliable performance in various sequential decision-making problems.

Decision Making, Policy Gradient Methods

Network Utility Maximization with Unknown Utility Functions: A Distributed, Data-Driven Bilevel Optimization Approach

no code implementations • 4 Jan 2023 • Kaiyi Ji, Lei Ying

In this paper, we provide a new solution using a distributed and data-driven bilevel optimization approach, where the lower level is a distributed network utility maximization (NUM) algorithm with concave surrogate utility functions, and the upper level is a data-driven learning algorithm to find the best surrogate utility functions that maximize the sum of true network utility.

Bilevel Optimization

Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems

no code implementations • 13 Dec 2022 • Xin Liu, Honghao Wei, Lei Ying

The proposed algorithm is distributed in two aspects: (i) the learned policy is a distributed policy that maps a local state of an agent to its local action and (ii) the learning/training is distributed, during which each agent updates its policy based on its own and neighbors' information.

Multi-agent Reinforcement Learning, Reinforcement Learning

MaxWeight With Discounted UCB: A Provably Stable Scheduling Policy for Nonstationary Multi-Server Systems With Unknown Statistics

no code implementations • 2 Sep 2022 • Zixian Yang, R. Srikant, Lei Ying

Simulation results confirm that the proposed algorithm can stabilize the queues and that it outperforms MaxWeight with empirical mean and MaxWeight with discounted empirical mean.

Scheduling

Will Bilevel Optimizers Benefit from Loops

no code implementations • 27 May 2022 • Kaiyi Ji, Mingrui Liu, Yingbin Liang, Lei Ying

Existing studies in the literature cover only some of those implementation choices, and the complexity bounds available are not refined enough to enable rigorous comparison among different implementations.

Bilevel Optimization

Exploration, Exploitation, and Engagement in Multi-Armed Bandits with Abandonment

no code implementations • 26 May 2022 • Zixian Yang, Xin Liu, Lei Ying

To understand the exploration, exploitation, and engagement in these systems, we propose a new model, called MAB-A where "A" stands for abandonment and the abandonment probability depends on the current recommended item and the user's past experience (called state).

Multi-Armed Bandits, Q-Learning

Obstacle Avoidance for UAS in Continuous Action Space Using Deep Reinforcement Learning

no code implementations • 13 Nov 2021 • Jueming Hu, Xuxi Yang, Weichang Wang, Peng Wei, Lei Ying, Yongming Liu

Obstacle avoidance for small unmanned aircraft is vital for the safety of future urban air mobility (UAM) and Unmanned Aircraft System (UAS) Traffic Management (UTM).

Continuous Control, Management

A Provably-Efficient Model-Free Algorithm for Constrained Markov Decision Processes

no code implementations • 3 Jun 2021 • Honghao Wei, Xin Liu, Lei Ying

This paper presents the first model-free, simulator-free reinforcement learning algorithm for Constrained Markov Decision Processes (CMDPs) with sublinear regret and zero constraint violation.

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

no code implementations • NeurIPS 2021 • Xin Liu, Bin Li, Pengyi Shi, Lei Ying

Thus, the overall computational complexity of our algorithm is similar to that of the linear UCB for unconstrained stochastic linear bandits.

POND: Pessimistic-Optimistic oNline Dispatching

no code implementations • 20 Oct 2020 • Xin Liu, Bin Li, Pengyi Shi, Lei Ying

This paper considers constrained online dispatching with unknown arrival, reward and constraint distributions.

FORK: A Forward-Looking Actor For Model-Free Reinforcement Learning

2 code implementations • 4 Oct 2020 • Honghao Wei, Lei Ying

In this paper, we propose a new type of Actor, named forward-looking Actor or FORK for short, for Actor-Critic algorithms.

Reinforcement Learning

The Mean-Squared Error of Double Q-Learning

1 code implementation • NeurIPS 2020 • Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning.

Q-Learning
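
For readers unfamiliar with the two algorithms being compared in the entry above, the standard tabular Double Q-learning update can be sketched as follows. This is a minimal illustration of the textbook update rule, not the paper's analysis or code; the function name `double_q_update`, the list-of-lists table layout, and the default `alpha` and `gamma` values are assumptions made for this sketch.

```python
import random


def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.9, rng=random):
    """Apply one tabular Double Q-learning step in place.

    qa and qb are hypothetical tables indexed as table[state][action].
    """
    # Pick which table learns on this step; the other one evaluates.
    if rng.random() < 0.5:
        learn, evaluate = qa, qb
    else:
        learn, evaluate = qb, qa
    # The learning table selects the greedy next action ...
    n_actions = len(learn[s_next])
    best = max(range(n_actions), key=lambda x: learn[s_next][x])
    # ... and the other table supplies its value, which decouples
    # action selection from action evaluation.
    target = r + gamma * evaluate[s_next][best]
    learn[s][a] += alpha * (target - learn[s][a])
```

Plain Q-learning would instead use a single table for both the `max` and the value in `target`; keeping the two roles in separate tables is the only structural difference in this sketch.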

Finite-Time Performance Bounds and Adaptive Learning Rate Selection for Two Time-Scale Reinforcement Learning

1 code implementation • NeurIPS 2019 • Harsh Gupta, R. Srikant, Lei Ying

We study two time-scale linear stochastic approximation algorithms, which can be used to model well-known reinforcement learning algorithms such as GTD, GTD2, and TDC.

Reinforcement Learning

QuickStop: A Markov Optimal Stopping Approach for Quickest Misinformation Detection

no code implementations • 4 Mar 2019 • Honghao Wei, Xiaohan Kang, Weina Wang, Lei Ying

The algorithm consists of an offline machine learning algorithm for learning the probabilistic information spreading model and an online optimal stopping algorithm to detect misinformation.

Misinformation

Finite-Time Error Bounds For Linear Stochastic Approximation and TD Learning

no code implementations • 3 Feb 2019 • R. Srikant, Lei Ying

We consider the dynamics of a linear stochastic approximation algorithm driven by Markovian noise, and derive finite-time bounds on the moments of the error, i.e., the deviation of the output of the algorithm from the equilibrium point of an associated ordinary differential equation (ODE).
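
As a rough illustration of the setting described in the snippet above, a constant-step-size linear stochastic approximation iteration and its associated ODE are commonly written as below. The symbols $\theta_k$, $\epsilon$, $A(\cdot)$, $b(\cdot)$, and $X_k$ are notation assumed here for illustration and do not come from this listing.

```latex
\theta_{k+1} = \theta_k + \epsilon \bigl( A(X_k)\,\theta_k + b(X_k) \bigr),
\qquad
\dot{\theta} = \bar{A}\,\theta + \bar{b},
\qquad
\theta^{*} = -\bar{A}^{-1}\,\bar{b}
```

Here $X_k$ is the Markov chain supplying the noise, $\bar{A}$ and $\bar{b}$ are the averages of $A(X_k)$ and $b(X_k)$ under its stationary distribution, and $\theta^{*}$ is the equilibrium point of the ODE (assuming $\bar{A}$ is invertible), so the error in question is $\theta_k - \theta^{*}$.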

Collaborative Filtering with Information-Rich and Information-Sparse Entities

no code implementations • 6 Mar 2014 • Kai Zhu, Rui Wu, Lei Ying, R. Srikant

In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users).

Collaborative Filtering, Recommendation Systems

Jointly Clustering Rows and Columns of Binary Matrices: Algorithms and Trade-offs

no code implementations • 1 Oct 2013 • Jiaming Xu, Rui Wu, Kai Zhu, Bruce Hajek, R. Srikant, Lei Ying

In standard clustering problems, data points are represented by vectors, and by stacking them together, one forms a data matrix with row or column cluster structure.
