2 code implementations • 10 Jun 2022 • Ruida Zhou, Tao Liu, Dileep Kalathil, P. R. Kumar, Chao Tian
We study policy optimization for Markov decision processes (MDPs) with multiple reward value functions, which are to be jointly optimized according to given criteria such as proportional fairness (smooth concave scalarization), hard constraints (constrained MDP), and max-min trade-off.
1 code implementation • 27 Sep 2021 • Tao Liu, P. R. Kumar, Ruida Zhou, Xi Liu
Motivated by the problem of learning with small sample sizes, this paper shows how to incorporate into support-vector machines (SVMs) those properties that have made convolutional neural networks (CNNs) successful.
1 code implementation • 25 Sep 2022 • Santosh Ganji, Jaewon Kim, P. R. Kumar
This allows the mobile to maintain time-synchronization with the base station, allowing it to revert to the LoS path when the temporary blockage disappears.
1 code implementation • NeurIPS 2023 • Ruida Zhou, Tao Liu, Min Cheng, Dileep Kalathil, P. R. Kumar, Chao Tian
We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment.
1 code implementation • 29 Oct 2018 • Ping-Chun Hsieh, Xi Liu, Anirban Bhattacharya, P. R. Kumar
Sequential decision making for lifetime maximization is a critical problem in many real-world applications, such as medical treatment and portfolio selection.
no code implementations • 6 Sep 2017 • Rahul Singh, P. R. Kumar, Eytan Modiano
The key difference arises due to the fact that in our set-up the packets lose their utility once their "age" has crossed their deadline, thus making the task of optimizing timely throughput much more challenging than that of ensuring network stability.
no code implementations • 10 Aug 2016 • Mohammadhussein Rafieisakhaei, Suman Chakravorty, P. R. Kumar
Planning under motion and observation uncertainties requires solution of a stochastic control problem in the space of feedback policies.
Robotics • Optimization and Control
no code implementations • 2 Jul 2019 • Xi Liu, Ping-Chun Hsieh, Anirban Bhattacharya, P. R. Kumar
To choose the bias-growth rate $\alpha(t)$ in RBMLE, we reveal the nontrivial interplay between $\alpha(t)$ and the regret bound that generally applies in both the Exponential Family as well as the sub-Gaussian/Exponential family bandits.
no code implementations • 21 Mar 2020 • Rahul Singh, P. R. Kumar
We design an adaptive controller (learning rule) for a networked control system (NCS) in which data packets containing control information are transmitted across a lossy wireless channel.
no code implementations • 8 Oct 2020 • Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar
Modifying the reward-biased maximum likelihood method originally proposed in the adaptive control literature, we propose novel learning algorithms to handle the explore-exploit trade-off in linear bandits problems as well as generalized linear bandits problems.
no code implementations • 16 Nov 2020 • Akshay Mete, Rahul Singh, Xi Liu, P. R. Kumar
The Reward-Biased Maximum Likelihood Estimate (RBMLE) for adaptive control of Markov chains was proposed to overcome the central obstacle of what is variously called the fundamental "closed-identifiability problem" of adaptive control, the "dual control problem", or, contemporaneously, the "exploration vs. exploitation problem".
no code implementations • 22 May 2021 • Lu Zhang, Bin Wang, Vivek Sarin, Weiping Shi, P. R. Kumar, Le Xie
In power system dynamic simulation, up to 90% of the computational time is devoted to solving the network equations, i.e., a set of linear equations.
no code implementations • NeurIPS 2021 • Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian
We show that when a strictly safe policy is known, then one can confine the system to zero constraint violation with arbitrarily high probability while keeping the reward regret of order $\tilde{\mathcal{O}}(\sqrt{K})$.
no code implementations • 18 Jul 2021 • Santosh Ganji, Tzu-Hsiang Lin, Jaewon Kim, P. R. Kumar
In mm-wave networks, cell sizes are small due to high path and penetration losses.
no code implementations • 4 Oct 2021 • Santosh Ganji, Tzu-Hsiang Lin, Francisco A. Espinal, P. R. Kumar
Management of narrow directional beams is critical for mm-wave communication systems.
no code implementations • 31 Oct 2021 • Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian
We propose a new algorithm, policy mirror descent-primal dual (PMD-PD), that can provably achieve a faster $\mathcal{O}(\log(T)/T)$ convergence rate for both the optimality gap and the constraint violation.
no code implementations • 25 Jan 2022 • Akshay Mete, Rahul Singh, P. R. Kumar
We consider the problem of controlling an unknown stochastic linear system with quadratic costs, called the adaptive LQ control problem.
no code implementations • 1 Jun 2022 • Le Xie, Tong Huang, P. R. Kumar, Anupam A. Thatte, Sanjoy K. Mitter
This paper presents considerations towards an information and control architecture for future electric energy systems driven by massive changes resulting from the societal goals of decarbonization and electrification.
no code implementations • 2 Nov 2022 • Le Xie, Tong Huang, Xiangtian Zheng, Yan Liu, Mengdi Wang, Vijay Vittal, P. R. Kumar, Srinivas Shakkottai, Yi Cui
The transition towards carbon-neutral electricity is one of the biggest game changers in addressing climate change since it addresses the dual challenges of removing carbon emissions from the two largest sectors of emitters: electricity and transportation.
no code implementations • 10 Jan 2023 • Santosh Ganji, Jaewon Kim, Romil Sonigra, P. R. Kumar
To avoid outage during transient pedestrian blockage of the LoS path, the mobile uses a reflected or NLoS path available in indoor environments.
no code implementations • 29 Jan 2023 • Enoch Hyunwook Kang, P. R. Kumar
In online exploration systems where users with fixed preferences repeatedly arrive, it has recently been shown that $O(1)$, i.e., bounded regret, can be achieved when the system is modeled as a linear contextual bandit.
no code implementations • 26 May 2023 • Rahul Singh, Akshay Mete, Avik Kar, P. R. Kumar
Minimum variance controllers have been employed in a wide range of industrial applications.
no code implementations • 17 Oct 2023 • Yu-Heng Hung, Ping-Chun Hsieh, Akshay Mete, P. R. Kumar
We consider infinite-horizon linear Markov Decision Processes (MDPs), where the transition probabilities of the dynamic model can be linearly parameterized with the help of a predefined low-dimensional feature mapping.
no code implementations • 9 Mar 2024 • Min Cheng, Ruida Zhou, P. R. Kumar, Chao Tian
We prove that both algorithms based on independent policy gradient and independent natural policy gradient converge globally to a Nash equilibrium for the average reward criterion.