Search Results for author: Ness Shroff

Found 24 papers, 1 paper with code

Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate

no code implementations • 21 Feb 2024 • Yuchen Liang, Peizhong Ju, Yingbin Liang, Ness Shroff

In this paper, we establish the convergence guarantee for substantially larger classes of distributions under discrete-time diffusion models and further improve the convergence rate for distributions with bounded support.

Denoising

Model-Free Change Point Detection for Mixing Processes

no code implementations • 14 Dec 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff

In particular, we provide performance guarantees for the MMD-CUSUM test under $\alpha$-, $\beta$-, and $\phi$-mixing processes, which significantly expands its utility beyond the i.i.d. setting.

Change Point Detection
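
The MMD-CUSUM test pairs a kernel two-sample statistic with CUSUM-style accumulation. The sketch below is only an illustration of that combination; the window size, drift, and threshold are arbitrary choices for the demo, not the paper's tuned values.

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian kernel matrix between two 1-D samples."""
    return np.exp(-gamma * (x[:, None] - y[None, :]) ** 2)

def mmd2(x, y, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy (MMD^2)."""
    return (rbf_kernel(x, x, gamma).mean()
            + rbf_kernel(y, y, gamma).mean()
            - 2.0 * rbf_kernel(x, y, gamma).mean())

def mmd_cusum(stream, reference, window=50, drift=0.05, threshold=1.0):
    """Accumulate (MMD^2 - drift) over sliding windows, CUSUM-style;
    raise an alarm when the running statistic crosses the threshold."""
    s = 0.0
    for t in range(window, len(stream) + 1):
        s = max(0.0, s + mmd2(stream[t - window:t], reference) - drift)
        if s > threshold:
            return t  # alarm time
    return None

rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, 200)                 # pre-change calibration data
stream = np.concatenate([rng.normal(0.0, 1.0, 300),   # change point at t = 300
                         rng.normal(3.0, 1.0, 300)])
alarm = mmd_cusum(stream, reference)
```

On this synthetic mean-shift stream the alarm fires shortly after the change at t = 300; the drift term keeps the statistic near zero while the stream matches the reference sample.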

Hoeffding's Inequality for Markov Chains under Generalized Concentrability Condition

no code implementations • 4 Oct 2023 • Hao Chen, Abhishek Gupta, Yin Sun, Ness Shroff

This paper studies Hoeffding's inequality for Markov chains under the generalized concentrability condition defined via integral probability metric (IPM).
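
For background, the classical Hoeffding inequality for i.i.d. samples and a bounded function, which the paper extends to Markov chains under an IPM-based concentrability condition, reads:

```latex
\mathbb{P}\left( \left| \frac{1}{n} \sum_{i=1}^{n} f(X_i) - \mathbb{E}[f(X_1)] \right| \ge t \right)
\le 2 \exp\left( - \frac{2 n t^2}{(b-a)^2} \right),
\qquad f \colon \mathcal{X} \to [a, b].
```

For Markov chains the exponent typically degrades by a factor reflecting how quickly the chain mixes.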

Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping

no code implementations • 22 Jun 2023 • Yining Li, Peizhong Ju, Ness Shroff

To address this issue, we formulate a general optimization problem for determining the optimal grouping strategy, which strikes a balance between performance loss and sample/computational complexity.

Theoretical Hardness and Tractability of POMDPs in RL with Partial Online State Information

no code implementations • 14 Jun 2023 • Ming Shi, Yingbin Liang, Ness Shroff

However, existing theoretical results have shown that learning in POMDPs is intractable in the worst case, where the main challenge lies in the lack of latent state information.

Provably Efficient Model-Free Algorithms for Non-stationary CMDPs

no code implementations • 10 Mar 2023 • Honghao Wei, Arnob Ghosh, Ness Shroff, Lei Ying, Xingyu Zhou

We study model-free reinforcement learning (RL) algorithms in episodic non-stationary constrained Markov Decision Processes (CMDPs), in which an agent aims to maximize the expected cumulative reward subject to a cumulative constraint on the expected utility (cost).

Reinforcement Learning (RL)

Theory on Forgetting and Generalization of Continual Learning

no code implementations • 12 Feb 2023 • Sen Lin, Peizhong Ju, Yingbin Liang, Ness Shroff

In particular, there is a lack of understanding on what factors are important and how they affect "catastrophic forgetting" and generalization performance.

Continual Learning

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

no code implementations • 8 Feb 2023 • Ming Shi, Yingbin Liang, Ness Shroff

In many applications of Reinforcement Learning (RL), it is critically important that the algorithm performs safely, such that instantaneous hard constraints are satisfied at each step, and unsafe states and actions are avoided.

Reinforcement Learning (RL) +1

Near-Optimal Adversarial Reinforcement Learning with Switching Costs

no code implementations • 8 Feb 2023 • Ming Shi, Yingbin Liang, Ness Shroff

Our lower bound indicates that, due to the fundamental challenge of switching costs in adversarial RL, the best achieved regret (whose dependency on $T$ is $\tilde{O}(\sqrt{T})$) in static RL with switching costs (as well as adversarial RL without switching costs) is no longer achievable.

Reinforcement Learning (RL)

Provably Efficient Model-Free Constrained RL with Linear Function Approximation

no code implementations • 23 Jun 2022 • Arnob Ghosh, Xingyu Zhou, Ness Shroff

To this end, we consider the episodic constrained Markov decision processes with linear function approximation, where the transition dynamics and the reward function can be represented as a linear function of some known feature mapping.

Sample Complexity Bounds for Active Ranking from Multi-wise Comparisons

1 code implementation • NeurIPS 2021 • Wenbo Ren, Jia Liu, Ness Shroff

Here, a multi-wise comparison takes $m$ items as input and returns a (noisy) result about the best item (the winner feedback) or the order of these items (the full-ranking feedback).

Adaptive Control of Differentially Private Linear Quadratic Systems

no code implementations • 26 Aug 2021 • Sayak Ray Chowdhury, Xingyu Zhou, Ness Shroff

In this paper, we study the problem of regret minimization in reinforcement learning (RL) under differential privacy constraints.

Reinforcement Learning (RL)

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

regression
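
One common way to realize weighted Gaussian process regression for non-stationary problems is to discount stale observations through their effective noise level. The sketch below takes that route with an exponential weight profile; the kernel, lengthscale, and discount factor are illustrative assumptions, not the paper's WGP-UCB specification.

```python
import numpy as np

def rbf(a, b, ls=0.15):
    """Squared-exponential kernel matrix for 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

def weighted_gp_posterior(X, y, Xq, gamma=0.7, noise=0.1, ls=0.15):
    """GP posterior in which observation i receives weight gamma**(T-1-i):
    stale data is discounted by inflating its effective noise variance."""
    T = len(X)
    w = gamma ** np.arange(T - 1, -1, -1)      # newest observation: weight 1
    K = rbf(X, X, ls) + np.diag(noise / w)     # discounted observation noise
    Ks = rbf(Xq, X, ls)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks @ alpha
    v = np.linalg.solve(L, Ks.T)
    var = np.clip(1.0 - (v ** 2).sum(axis=0), 0.0, None)  # rbf(x, x) = 1
    return mu, var

# The environment shifts: old data peaks near x = 0.2, recent data near x = 0.8.
grid = np.linspace(0.0, 1.0, 10)
X = np.concatenate([grid, grid])
y = np.concatenate([-(grid - 0.2) ** 2, -(grid - 0.8) ** 2])
mu, var = weighted_gp_posterior(X, y, np.array([0.2, 0.8]))
```

Because the old observations carry inflated noise, the posterior mean tracks the recent peak near 0.8 rather than averaging the two regimes; a UCB rule such as argmax of mu + beta * sqrt(var) can then be applied on top.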

No-Regret Algorithms for Time-Varying Bayesian Optimization

no code implementations • 11 Feb 2021 • Xingyu Zhou, Ness Shroff

In this paper, we consider the time-varying Bayesian optimization problem.

Bayesian Optimization

Bandit Policies for Reliable Cellular Network Handovers in Extreme Mobility

no code implementations • 28 Oct 2020 • Yuanjie Li, Esha Datta, Jiaxin Ding, Ness Shroff, Xin Liu

The demand for seamless Internet access under extreme user mobility, such as on high-speed trains and vehicles, has become a norm rather than an exception.

Thompson Sampling

Multi-Armed Bandits with Dependent Arms

no code implementations • 13 Oct 2020 • Rahul Singh, Fang Liu, Yin Sun, Ness Shroff

We study a variant of the classical multi-armed bandit problem (MABP), which we call Multi-Armed Bandits with Dependent Arms.

Multi-Armed Bandits

Contextual Bandits with Side-Observations

no code implementations • 6 Jun 2020 • Rahul Singh, Fang Liu, Xin Liu, Ness Shroff

We show that this asymptotically optimal regret is upper-bounded as $O\left(|\chi(\mathcal{G})|\log T\right)$, where $|\chi(\mathcal{G})|$ is the domination number of $\mathcal{G}$.

Multi-Armed Bandits

Data Poisoning Attacks on Stochastic Bandits

no code implementations • 16 May 2019 • Fang Liu, Ness Shroff

Then we study a form of online attacks on bandit algorithms and propose an adaptive attack strategy against any bandit algorithm without the knowledge of the bandit algorithm.

Data Poisoning • Multi-Armed Bandits +1

Exploring $k$ out of Top $\rho$ Fraction of Arms in Stochastic Bandits

no code implementations • 28 Oct 2018 • Wenbo Ren, Jia Liu, Ness Shroff

Results in this paper provide up to $\rho n/k$ reductions compared with the "$k$-exploration" algorithms that focus on finding the (PAC) best $k$ arms out of $n$ arms.

Analysis of Thompson Sampling for Graphical Bandits Without the Graphs

no code implementations • 23 May 2018 • Fang Liu, Zizhan Zheng, Ness Shroff

To fill this gap, we propose a variant of Thompson Sampling, that attains the optimal regret in the directed setting within a logarithmic factor.

Thompson Sampling
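
The proposed method is a variant of Thompson Sampling; for reference, here is a minimal Beta-Bernoulli Thompson Sampling loop (the base algorithm only, not the paper's graph-feedback variant):

```python
import numpy as np

def thompson_sampling(means, horizon=2000, seed=0):
    """Beta-Bernoulli Thompson Sampling: sample a mean from each arm's
    Beta posterior, pull the argmax, then update that arm's posterior."""
    rng = np.random.default_rng(seed)
    k = len(means)
    wins = np.ones(k)    # Beta posterior alpha (successes + 1)
    losses = np.ones(k)  # Beta posterior beta  (failures + 1)
    pulls = np.zeros(k, dtype=int)
    for _ in range(horizon):
        theta = rng.beta(wins, losses)      # one posterior sample per arm
        a = int(np.argmax(theta))
        r = rng.random() < means[a]         # Bernoulli reward
        wins[a] += r
        losses[a] += 1 - r
        pulls[a] += 1
    return pulls

pulls = thompson_sampling([0.2, 0.5, 0.8])
# the best arm (mean 0.8) receives the bulk of the pulls
```

Sampling from the posterior, rather than acting on the posterior mean, is what drives exploration: arms with few pulls have wide posteriors and occasionally produce the largest sample.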

UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits

no code implementations • 16 Apr 2018 • Fang Liu, Sinong Wang, Swapna Buccapatnam, Ness Shroff

We show that UCBoost($D$) enjoys $O(1)$ complexity for each arm per round as well as regret guarantee that is $1/e$-close to that of the kl-UCB algorithm.

Decision Making
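
The kl-UCB index that serves as the comparison point can be computed by bisection. A minimal sketch for Bernoulli rewards follows; UCBoost itself replaces the KL divergence with a boosting of cheaper surrogate divergences, which is not shown here.

```python
import math

def kl_bernoulli(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clamped away from {0, 1}."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb_index(p_hat, n, t, iters=40):
    """Largest q >= p_hat with n * kl(p_hat, q) <= log(t), found by bisection."""
    budget = math.log(max(t, 2)) / n
    lo, hi = p_hat, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if kl_bernoulli(p_hat, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

print(round(kl_ucb_index(0.5, 100, 100), 3))  # → 0.648
```

The index shrinks toward the empirical mean as the pull count n grows, and grows slowly with t, which is exactly the optimism schedule behind kl-UCB's regret guarantee; each bisection costs O(iters) KL evaluations per arm per round, the complexity that UCBoost's closed-form surrogate bounds avoid.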

A New Alternating Direction Method for Linear Programming

no code implementations • NeurIPS 2017 • Sinong Wang, Ness Shroff

It is well known that, for a linear program (LP) with constraint matrix $\mathbf{A}\in\mathbb{R}^{m\times n}$, the Alternating Direction Method of Multiplier converges globally and linearly at a rate $O((\|\mathbf{A}\|_F^2+mn)\log(1/\epsilon))$.
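
For reference, one standard splitting that puts a standard-form LP into ADMM form (a generic textbook formulation; the paper's new alternating direction method may use a different splitting) is:

```latex
\min_{x \in \mathbb{R}^n} c^{\top} x
\ \ \text{s.t.}\ \ \mathbf{A}x = b,\ x \ge 0
\quad\Longleftrightarrow\quad
\min_{x, z}\ c^{\top} x + \mathbb{I}_{\{\mathbf{A}x = b\}}(x) + \mathbb{I}_{\{z \ge 0\}}(z)
\ \ \text{s.t.}\ \ x = z .
```

ADMM then alternates an equality-constrained minimization in $x$, a projection onto the nonnegative orthant in $z$, and a dual update on the consensus constraint $x = z$.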

A Change-Detection based Framework for Piecewise-stationary Multi-Armed Bandit Problem

no code implementations • 8 Nov 2017 • Fang Liu, Joohyun Lee, Ness Shroff

The multi-armed bandit problem has been extensively studied under the stationary assumption.

Change Detection

Information Directed Sampling for Stochastic Bandits with Graph Feedback

no code implementations • 8 Nov 2017 • Fang Liu, Swapna Buccapatnam, Ness Shroff

We consider stochastic multi-armed bandit problems with graph feedback, where the decision maker is allowed to observe the neighboring actions of the chosen action.

Decision Making • Thompson Sampling
