Search Results for author: P. R. Kumar

Found 24 papers, 5 papers with code

Anchor-Changing Regularized Natural Policy Gradient for Multi-Objective Reinforcement Learning

2 code implementations • 10 Jun 2022 • Ruida Zhou, Tao Liu, Dileep Kalathil, P. R. Kumar, Chao Tian

We study policy optimization for Markov decision processes (MDPs) with multiple reward value functions, which are to be jointly optimized according to given criteria such as proportional fairness (smooth concave scalarization), hard constraints (constrained MDP), and max-min trade-off.

Fairness • Multi-Objective Reinforcement Learning • +1
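
For a concrete feel for the smooth concave scalarization mentioned in the snippet, here is a minimal sketch, not the paper's Anchor-Changing Regularized NPG: tabular softmax policy gradient (via finite differences) on a proportional-fairness, i.e., sum-of-logs, scalarization of two value functions. The toy MDP, step size, and start distribution are all assumptions.

```python
import numpy as np

# Toy 2-state, 2-action MDP with two reward functions (assumed example data).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[s, a, s']
              [[0.7, 0.3], [0.05, 0.95]]])
R = np.array([[[1.0, 0.0], [0.0, 1.0]],   # R[k, s, a]: reward of objective k
              [[0.0, 1.0], [1.0, 0.0]]])
gamma, eta, eps = 0.9, 0.5, 1e-5

def values(theta):
    """Exact discounted value of the softmax policy, one entry per objective."""
    pi = np.exp(theta) / np.exp(theta).sum(axis=1, keepdims=True)  # pi[s, a]
    Ppi = np.einsum('sa,sap->sp', pi, P)
    vals = []
    for k in range(2):
        r_pi = (pi * R[k]).sum(axis=1)
        v = np.linalg.solve(np.eye(2) - gamma * Ppi, r_pi)
        vals.append(v @ np.array([0.5, 0.5]))  # uniform start distribution
    return np.array(vals)

def scalarized(theta):
    # Proportional fairness: sum of logs of the per-objective values.
    return np.log(values(theta)).sum()

theta = np.zeros((2, 2))
for _ in range(200):
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):       # finite-difference gradient
        t = theta.copy(); t[idx] += eps
        grad[idx] = (scalarized(t) - scalarized(theta)) / eps
    theta += eta * grad

print("per-objective values:", values(theta))  # balanced, neither sacrificed
```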

Learning from Few Samples: Transformation-Invariant SVMs with Composition and Locality at Multiple Scales

1 code implementation • 27 Sep 2021 • Tao Liu, P. R. Kumar, Ruida Zhou, Xi Liu

Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful.
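
A common baseline for transformation invariance, shown here only as an illustrative sketch rather than the paper's composition-and-locality construction, is to augment the training set with transformed copies before fitting the SVM. The toy 1-D data, shift range, and kernel choice are assumptions.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_sample(cls):
    """Toy 1-D 'image': a pattern whose position should not matter (assumed data)."""
    x = np.zeros(16)
    x[4:8] = 1.0 if cls == 0 else np.array([1.0, -1.0, 1.0, -1.0])
    return x + 0.05 * rng.normal(size=16)

X = np.array([make_sample(c) for c in (0, 1) for _ in range(10)])
y = np.array([c for c in (0, 1) for _ in range(10)])

def augment(X, y, shifts=(-2, -1, 1, 2)):
    """Add translated copies so the learned boundary is shift-invariant."""
    Xa = [X] + [np.roll(X, s, axis=1) for s in shifts]
    return np.vstack(Xa), np.tile(y, len(shifts) + 1)

Xa, ya = augment(X, y)
clf = SVC(kernel='rbf', gamma='scale').fit(Xa, ya)

# Test on samples shifted further than anything in the raw training set.
X_test = np.roll(np.array([make_sample(c) for c in (0, 1)]), 3, axis=1)
print(clf.predict(X_test))   # ideally [0, 1]
```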

Terra: Blockage Resilience in Outdoor mmWave Networks

1 code implementation • 25 Sep 2022 • Santosh Ganji, Jaewon Kim, P. R. Kumar

This allows the mobile to maintain time-synchronization with the base station, so that it can revert to the LoS path when the temporary blockage disappears.

Detect Ground Reflections
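
A schematic of the blockage-handling logic the snippet describes, as an assumed illustration rather than Terra's actual protocol: on an LoS blockage the mobile steers to a known reflected beam while staying synchronized, then reverts once the LoS path recovers. The thresholds and measurement interface are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class BeamState:
    on_los: bool = True

# Assumed thresholds (dB); not from the paper.
BLOCKAGE_DROP_DB = 15.0
RECOVERY_MARGIN_DB = 3.0

def step(state, los_snr_db, nlos_snr_db, los_baseline_db):
    """One beam-management decision per measurement report."""
    if state.on_los and los_snr_db < los_baseline_db - BLOCKAGE_DROP_DB:
        # Temporary blockage: switch to the reflected (NLoS) beam but keep
        # listening to the base station so time-synchronization is maintained.
        state.on_los = False
        return "switch_to_nlos"
    if not state.on_los and los_snr_db > nlos_snr_db + RECOVERY_MARGIN_DB:
        # Blockage cleared: revert to the stronger LoS path.
        state.on_los = True
        return "revert_to_los"
    return "stay"

s = BeamState()
for los, nlos in [(30, 20), (10, 20), (12, 20), (28, 20)]:
    print(step(s, los, nlos, los_baseline_db=30))
```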

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation

1 code implementation • NeurIPS 2023 • Ruida Zhou, Tao Liu, Min Cheng, Dileep Kalathil, P. R. Kumar, Chao Tian

We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment.

reinforcement-learning • Reinforcement Learning (RL)
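
To make "robust against model mismatch" concrete, here is a tabular robust value-iteration sketch under an assumed R-contamination uncertainty set, in which an adversary may divert a fraction rho of the transition mass. The paper's contribution, a natural actor-critic with function approximation, is not reproduced here; the toy MDP is invented.

```python
import numpy as np

# Toy MDP (assumed data): P[s, a, s'], R[s, a].
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.6, 0.4], [0.1, 0.9]]])
R = np.array([[1.0, 0.5], [0.2, 0.8]])
gamma, rho = 0.9, 0.1   # rho = contamination level of the uncertainty set

def robust_value_iteration(P, R, n_iter=500):
    """Worst case over the R-contamination set {(1-rho)P + rho*q : q arbitrary}."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(n_iter):
        # The adversary puts its rho mass on the lowest-value next state.
        worst = V.min()
        Q = R + gamma * ((1 - rho) * P @ V + rho * worst)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)

V, pi = robust_value_iteration(P, R)
print("robust V:", V, "robust policy:", pi)
```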

Stay With Me: Lifetime Maximization Through Heteroscedastic Linear Bandits With Reneging

1 code implementation • 29 Oct 2018 • Ping-Chun Hsieh, Xi Liu, Anirban Bhattacharya, P. R. Kumar

Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection.

Decision Making • Multi-Armed Bandits
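
A toy illustration, not the paper's algorithm, of why heteroscedasticity matters under reneging: when a user departs once the outcome falls below a threshold, expected lifetime depends on an action's variance as much as on its mean. The Gaussian outcome models and the threshold are assumptions.

```python
import numpy as np
from scipy.stats import norm

# Two actions (assumed): similar means, very different noise scales.
means = np.array([1.0, 0.9])
stds  = np.array([1.0, 0.2])
theta = 0.5                        # user reneges when the outcome drops below theta

# Probability of surviving one round under each action.
p_stay = 1 - norm.cdf(theta, loc=means, scale=stds)

# With i.i.d. rounds, expected lifetime is geometric: 1 / (1 - p_stay).
lifetime = 1 / (1 - p_stay)
print(dict(zip(["high-mean/high-var", "low-mean/low-var"], lifetime)))
# The lower-mean but lower-variance action yields the longer expected lifetime.
```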

Throughput Optimal Decentralized Scheduling of Multi-Hop Networks with End-to-End Deadline Constraints: II Wireless Networks with Interference

no code implementations • 6 Sep 2017 • Rahul Singh, P. R. Kumar, Eytan Modiano

The key difference arises from the fact that in our set-up packets lose their utility once their "age" has crossed their deadline, thus making the task of optimizing timely throughput much more challenging than that of ensuring network stability.

Scheduling
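
A small sketch of the timely-throughput notion in the snippet, with an assumed single-link earliest-deadline-first scheduler rather than the paper's decentralized multi-hop policy: a packet delivered after its deadline earns nothing, so expired packets are simply dropped.

```python
import heapq

def timely_throughput(packets, horizon):
    """packets: list of (arrival, deadline); one transmission per slot.
    Earliest-deadline-first; only on-time deliveries count."""
    delivered, queue, i = 0, [], 0
    packets = sorted(packets)
    for t in range(horizon):
        while i < len(packets) and packets[i][0] <= t:
            heapq.heappush(queue, packets[i][1])   # heap ordered by deadline
            i += 1
        while queue and queue[0] < t:              # drop expired packets
            heapq.heappop(queue)
        if queue:
            heapq.heappop(queue)
            delivered += 1
    return delivered / horizon

# Assumed arrival pattern: (arrival_slot, deadline_slot).
pkts = [(0, 2), (0, 1), (1, 3), (4, 4), (4, 5), (4, 9)]
print("timely throughput:", timely_throughput(pkts, horizon=10))
```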

Belief Space Planning Simplified: Trajectory-Optimized LQG (T-LQG) (Extended Report)

no code implementations • 10 Aug 2016 • Mohammadhussein Rafieisakhaei, Suman Chakravorty, P. R. Kumar

Planning under motion and observation uncertainties requires solving a stochastic control problem in the space of feedback policies.

Robotics • Optimization and Control

Exploration Through Reward Biasing: Reward-Biased Maximum Likelihood Estimation for Stochastic Multi-Armed Bandits

no code implementations • 2 Jul 2019 • Xi Liu, Ping-Chun Hsieh, Anirban Bhattacharya, P. R. Kumar

To choose the bias-growth rate $\alpha(t)$ in RBMLE, we reveal the nontrivial interplay between $\alpha(t)$ and the regret bound, which applies to both Exponential Family bandits and sub-Gaussian/Exponential family bandits.

Multi-Armed Bandits
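
A numerical sketch of the reward-biasing principle for Bernoulli arms, not the paper's closed-form index: each arm's log-likelihood is biased by $\alpha(t)$ times the mean-reward parameter, and the arm whose biased maximum gains the most over its unbiased maximum is pulled. The choice $\alpha(t)=\sqrt{t}$, the grid maximization, and the pseudo-counts are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
true_means = np.array([0.4, 0.6, 0.8])        # assumed Bernoulli arms
K, T = len(true_means), 2000
S = np.ones(K); F = np.ones(K)                # success/failure pseudo-counts
grid = np.linspace(1e-3, 1 - 1e-3, 999)

def loglik(s, f, p):
    return s * np.log(p) + f * np.log(1 - p)

for t in range(1, T + 1):
    alpha = np.sqrt(t)                        # assumed bias-growth rate alpha(t)
    idx = np.empty(K)
    for i in range(K):
        ll = loglik(S[i], F[i], grid)
        # Biased maximum-likelihood value minus the unbiased maximum:
        # an RBMLE-style per-arm index, maximized numerically on a grid.
        idx[i] = (ll + alpha * grid).max() - ll.max()
    a = int(idx.argmax())
    if rng.random() < true_means[a]:
        S[a] += 1
    else:
        F[a] += 1

print("pull counts:", (S + F - 2).astype(int))  # the best arm should dominate
```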

Learning in Networked Control Systems

no code implementations • 21 Mar 2020 • Rahul Singh, P. R. Kumar

We design an adaptive controller (learning rule) for a networked control system (NCS) in which data packets containing control information are transmitted across a lossy wireless channel.

Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

no code implementations • 8 Oct 2020 • Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar

Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandit problems as well as generalized linear bandit problems.

Computational Efficiency

Reward Biased Maximum Likelihood Estimation for Reinforcement Learning

no code implementations • 16 Nov 2020 • Akshay Mete, Rahul Singh, Xi Liu, P. R. Kumar

The Reward-Biased Maximum Likelihood Estimate (RBMLE) for adaptive control of Markov chains was proposed to overcome the central obstacle of what is variously called the fundamental "closed-loop identifiability problem" of adaptive control, the "dual control problem", or, contemporaneously, the "exploration vs. exploitation problem".

Multi-Armed Bandits • reinforcement-learning • +2

An Efficient Network Solver for Dynamic Simulation of Power Systems Based on Hierarchical Inverse Computation and Modification

no code implementations • 22 May 2021 • Lu Zhang, Bin Wang, Vivek Sarin, Weiping Shi, P. R. Kumar, Le Xie

In power system dynamic simulation, up to 90% of the computational time is devoted to solving the network equations, i.e., a set of linear equations.
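
One standard way to exploit that observation, shown as an illustration rather than the paper's hierarchical inverse scheme: factorize the network matrix once, reuse the factors across time steps, and absorb a local network modification with a Sherman-Morrison-Woodbury update instead of refactoring. The matrix and the rank of the change below are assumptions.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 200
Y = rng.normal(size=(n, n)) + n * np.eye(n)    # stand-in for the network matrix

lu = lu_factor(Y)                              # factorize once ...
for _ in range(5):                             # ... reuse across time steps
    b = rng.normal(size=n)
    x = lu_solve(lu, b)

# A local change (e.g., a line outage) is low-rank: Y' = Y + U @ V.T.
U = rng.normal(size=(n, 2)); V = rng.normal(size=(n, 2))

def solve_modified(b):
    """Sherman-Morrison-Woodbury: solve (Y + U V^T) x = b without refactoring."""
    YinvB = lu_solve(lu, b)
    YinvU = lu_solve(lu, U)
    small = np.eye(2) + V.T @ YinvU            # 2x2 capacitance matrix
    return YinvB - YinvU @ np.linalg.solve(small, V.T @ YinvB)

b = rng.normal(size=n)
err = np.linalg.norm((Y + U @ V.T) @ solve_modified(b) - b)
print("residual:", err)                        # ~1e-12: same answer, no refactor
```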

Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

no code implementations • NeurIPS 2021 • Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian

We show that when a strictly safe policy is known, one can confine the system to zero constraint violation with arbitrarily high probability while keeping the reward regret of order $\tilde{\mathcal{O}}(\sqrt{K})$.

Safe Exploration

Policy Optimization for Constrained MDPs with Provable Fast Global Convergence

no code implementations • 31 Oct 2021 • Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian

We propose a new algorithm called policy mirror descent-primal dual (PMD-PD) algorithm that can provably achieve a faster $\mathcal{O}(\log(T)/T)$ convergence rate for both the optimality gap and the constraint violation.
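
A minimal tabular sketch of the mirror-descent primal-dual idea, not the PMD-PD algorithm itself (no modified Lagrangian, and none of its $\mathcal{O}(\log(T)/T)$ guarantees): the policy takes a multiplicative-weights step on the Lagrangian's Q-values while the dual variable ascends on the constraint violation. The toy constrained MDP and step sizes are assumed.

```python
import numpy as np

# Toy constrained MDP (assumed): maximize reward s.t. expected cost <= b.
P = np.array([[[0.9, 0.1], [0.1, 0.9]],
              [[0.8, 0.2], [0.2, 0.8]]])      # P[s, a, s']
R = np.array([[1.0, 0.0], [0.5, 0.4]])        # reward
C = np.array([[0.8, 0.1], [0.3, 0.2]])        # cost
gamma, b, eta_pi, eta_lam = 0.9, 2.0, 0.05, 0.05
rho0 = np.array([0.5, 0.5])                   # start distribution

def policy_eval(pi, r):
    Ppi = np.einsum('sa,sap->sp', pi, P)
    v = np.linalg.solve(np.eye(2) - gamma * Ppi, (pi * r).sum(axis=1))
    return v, r + gamma * P @ v               # state values, Q-values

pi = np.full((2, 2), 0.5)
lam = 0.0
for _ in range(1000):
    _, qr = policy_eval(pi, R)
    vc, qc = policy_eval(pi, C)
    # Mirror-descent (multiplicative-weights) step on the Lagrangian Q-values.
    pi = pi * np.exp(eta_pi * (qr - lam * qc))
    pi /= pi.sum(axis=1, keepdims=True)
    # Dual ascent on the constraint violation.
    lam = max(0.0, lam + eta_lam * (rho0 @ vc - b))

vr, _ = policy_eval(pi, R)
vc, _ = policy_eval(pi, C)
print("reward:", rho0 @ vr, "cost:", rho0 @ vc, "(target <=", b, ")")
```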

Augmented RBMLE-UCB Approach for Adaptive Control of Linear Quadratic Systems

no code implementations • 25 Jan 2022 • Akshay Mete, Rahul Singh, P. R. Kumar

We consider the problem of controlling an unknown stochastic linear system with quadratic costs, called the adaptive LQ control problem.

Thompson Sampling

On an Information and Control Architecture for Future Electric Energy Systems

no code implementations • 1 Jun 2022 • Le Xie, Tong Huang, P. R. Kumar, Anupam A. Thatte, Sanjoy K. Mitter

This paper presents considerations towards an information and control architecture for future electric energy systems driven by massive changes resulting from the societal goals of decarbonization and electrification.

Energy System Digitization in the Era of AI: A Three-Layered Approach towards Carbon Neutrality

no code implementations • 2 Nov 2022 • Le Xie, Tong Huang, Xiangtian Zheng, Yan Liu, Mengdi Wang, Vijay Vittal, P. R. Kumar, Srinivas Shakkottai, Yi Cui

The transition towards carbon-neutral electricity is one of the biggest game changers in addressing climate change, since it tackles the dual challenge of removing carbon emissions from the two largest emitting sectors: electricity and transportation.

Decision Making

TERRA: Beam Management for Outdoor mm-Wave Networks

no code implementations • 10 Jan 2023 • Santosh Ganji, Jaewon Kim, Romil Sonigra, P. R. Kumar

To avoid outage during transient pedestrian blockage of the LoS path, the mobile uses a reflected or NLoS path available in indoor environments.

Management

Bounded (O(1)) Regret Recommendation Learning via Synthetic Controls Oracle

no code implementations • 29 Jan 2023 • Enoch Hyunwook Kang, P. R. Kumar

In online exploration systems where users with fixed preferences repeatedly arrive, it has recently been shown that O(1), i.e., bounded regret, can be achieved when the system is modeled as a linear contextual bandit.

Recommendation Systems
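
For background only, not the paper's synthetic-controls construction: a standard LinUCB learner for the linear contextual bandit model the snippet refers to. Its regret grows like $\tilde{\mathcal{O}}(\sqrt{T})$ rather than O(1), which is the gap the snippet says a synthetic-controls oracle can close; the dimensions, noise level, and confidence width below are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, K, T = 5, 4, 3000
theta_star = rng.normal(size=d); theta_star /= np.linalg.norm(theta_star)

A_mat = np.eye(d); b_vec = np.zeros(d)        # ridge-regression statistics
beta = 1.0                                    # assumed confidence width
regret = 0.0
for t in range(T):
    contexts = rng.normal(size=(K, d))        # one feature vector per arm
    theta_hat = np.linalg.solve(A_mat, b_vec)
    Ainv = np.linalg.inv(A_mat)
    ucb = contexts @ theta_hat + beta * np.sqrt(
        np.einsum('kd,dc,kc->k', contexts, Ainv, contexts))
    a = int(ucb.argmax())
    x = contexts[a]
    reward = x @ theta_star + 0.1 * rng.normal()
    A_mat += np.outer(x, x); b_vec += reward * x
    regret += (contexts @ theta_star).max() - x @ theta_star

print("cumulative regret:", round(regret, 2))  # grows sublinearly, not O(1)
```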

Value-Biased Maximum Likelihood Estimation for Model-based Reinforcement Learning in Discounted Linear MDPs

no code implementations • 17 Oct 2023 • Yu-Heng Hung, Ping-Chun Hsieh, Akshay Mete, P. R. Kumar

We consider infinite-horizon linear Markov Decision Processes (MDPs), where the transition probabilities of the dynamic model can be linearly parameterized with the help of a predefined low-dimensional feature mapping.

Model-based Reinforcement Learning
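
To illustrate the linear parameterization in the snippet, the sketch below assumes P(s'|s, a) = phi(s, a)^T mu(s') for a known low-dimensional feature map phi, and recovers mu by least squares on one-hot next-state indicators. It is not the paper's value-biased estimator; the features and sample sizes are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
S, A, d = 3, 2, 4

# Known feature map phi[s, a] in R^d and ground-truth mu[d, s'] (assumed).
# Rows of mu are distributions and phi(s, a) are mixture weights, so the
# resulting P is a valid transition kernel that is exactly linear in phi.
phi = rng.random((S, A, d)); phi /= phi.sum(axis=2, keepdims=True)
mu = rng.random((d, S));     mu /= mu.sum(axis=1, keepdims=True)
P = np.einsum('sad,dp->sap', phi, mu)

# Collect transitions and regress empirical next-state indicators on features.
X, Y = [], []
for _ in range(20000):
    s, a = rng.integers(S), rng.integers(A)
    s_next = rng.choice(S, p=P[s, a])
    X.append(phi[s, a])
    Y.append(np.eye(S)[s_next])               # one-hot target for the regression
X, Y = np.array(X), np.array(Y)

mu_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)   # d x S estimate of mu
P_hat = np.einsum('sad,dp->sap', phi, mu_hat)
print("max model error:", np.abs(P_hat - P).max())  # small at this sample size
```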

Provable Policy Gradient Methods for Average-Reward Markov Potential Games

no code implementations • 9 Mar 2024 • Min Cheng, Ruida Zhou, P. R. Kumar, Chao Tian

We prove that both the independent policy gradient and the independent natural policy gradient algorithms converge globally to a Nash equilibrium under the average-reward criterion.

Policy Gradient Methods
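
A minimal stateless analogue, not the paper's average-reward Markov setting: in an identical-interest matrix game, the simplest potential game, independent softmax policy gradient by each player converges to a pure Nash equilibrium. The payoff matrix and step size are assumptions.

```python
import numpy as np

# Two-player identical-interest matrix game (assumed payoffs);
# the paper treats the much harder average-reward Markov case.
U = np.array([[3.0, 0.0],
              [0.0, 2.0]])                    # both players receive U[a1, a2]
eta = 0.5
theta = [np.zeros(2), np.zeros(2)]            # independent softmax parameters

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(300):
    p1, p2 = softmax(theta[0]), softmax(theta[1])
    # Each player ascends the gradient of its OWN expected payoff, independently.
    g1 = U @ p2            # marginal payoff of each action for player 1
    g2 = U.T @ p1
    theta[0] += eta * (p1 * (g1 - p1 @ g1))   # softmax policy gradient
    theta[1] += eta * (p2 * (g2 - p2 @ g2))

print("player 1:", softmax(theta[0]).round(3),
      "player 2:", softmax(theta[1]).round(3))
# Converges to one of the pure Nash equilibria, (a1, a2) = (0, 0) or (1, 1).
```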
