no code implementations • 13 Feb 2024 • Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$?
1 code implementation • 17 Jan 2024 • Tianwei Ni, Benjamin Eysenbach, Erfan Seyedsalehi, Michel Ma, Clement Gehring, Aditya Mahajan, Pierre-Luc Bacon
These findings culminate in a set of preliminary guidelines for RL practitioners.
no code implementations • 9 Jun 2023 • Erfan Seyedsalehi, Nima Akbarzadeh, Amit Sinha, Aditya Mahajan
In spite of the large literature on reinforcement learning (RL) algorithms for partially observable Markov decision processes (POMDPs), a complete theoretical understanding is still lacking.
no code implementations • 6 Feb 2023 • Hadi Nekoei, Akilesh Badrinaaraayanan, Amit Sinha, Mohammad Amini, Janarthanan Rajendran, Aditya Mahajan, Sarath Chandar
In our proposed method, when one agent updates its policy, other agents are allowed to update their policies as well, but at a slower rate.
no code implementations • 6 Nov 2022 • Gandharv Patil, Aditya Mahajan, Doina Precup
Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, due to the fact that function approximation in Markov decision processes (MDPs) can be viewed as inducing a partially observable MDP.
no code implementations • 7 Feb 2022 • Nima Akbarzadeh, Aditya Mahajan
In particular, we consider a restless bandit model, and propose a Thompson-sampling based learning algorithm which is tuned to the underlying structure of the model.
no code implementations • 20 Dec 2021 • Borna Sayedana, Mohammad Afshari, Peter E. Caines, Aditya Mahajan
These results show that the switched least squares method for MJS has the same rate of convergence as the least squares method for autonomous linear systems.
no code implementations • 19 Aug 2021 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
The regret bound of the algorithm was derived under a technical assumption on the induced norm of the closed loop system.
no code implementations • 18 Aug 2021 • Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network.
no code implementations • 29 Jun 2021 • Anirudha Jitani, Aditya Mahajan, Zhongwen Zhu, Hatem Abou-zeid, Emmanuel T. Fapi, Hakimeh Purmehdi
Mobile Edge Computing (MEC) refers to the concept of placing computational capability and applications at the edge of the network, providing benefits such as reduced latency in handling client requests, reduced network congestion, and improved performance of applications.
no code implementations • 12 Apr 2021 • Nima Akbarzadeh, Aditya Mahajan
We consider restless bandits with a general state space under partial observability with two observation models: in the first, the state of each bandit is not observable at all, and in the second, the state of each bandit is observable only if it is chosen.
no code implementations • 9 Nov 2020 • Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang
We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.
1 code implementation • 17 Oct 2020 • Jayakumar Subramanian, Amit Sinha, Raihan Seraj, Aditya Mahajan
Our key result is to show that if a function of the history (called approximate information state (AIS)) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program.
no code implementations • 13 Aug 2020 • Nima Akbarzadeh, Aditya Mahajan
We then revisit a previously proposed algorithm called the adaptive greedy algorithm, which is known to compute the Whittle index for a subclass of restless bandits.
no code implementations • 24 Apr 2020 • Mohammad Afshari, Aditya Mahajan
It is shown that the major agent's optimal control action is a linear function of the major agent's MMSE (minimum mean squared error) estimate of the system state, while the minor agent's optimal control action is a linear function of the major agent's MMSE estimate of the system state and a "correction term" which depends on the difference between the minor agent's MMSE estimate of its local state and the major agent's MMSE estimate of the minor agent's local state.
no code implementations • 28 Mar 2019 • Mohammad Afshari, Aditya Mahajan
We derive closed-form expressions for MTMSE estimates, which are linear functions of the observations, where the corresponding gain depends on the weight matrix that couples the estimation error.
no code implementations • 3 Apr 2018 • Jayakumar Subramanian, Aditya Mahajan
We generalize the RMC algorithm to post-decision state models and also present a variant that converges faster to an approximately optimal policy.
no code implementations • 31 Mar 2017 • Jhelum Chakravorty, Aditya Mahajan
Sufficient conditions are identified under which the value function and the optimal strategy of a Markov decision process (MDP) are even and quasi-convex in the state.
Optimization and Control
no code implementations • 17 Apr 2013 • Aditya Mahajan, M. S. Dahiya, H. P. Sanghvi
This paper focuses on conducting forensic data analysis of two widely used instant messaging (IM) applications on Android phones: WhatsApp and Viber.
Computers and Society • Cryptography and Security
no code implementations • 8 Sep 2012 • Ashutosh Nayyar, Aditya Mahajan, Demosthenis Teneketzis
A general model of decentralized stochastic control called partial history sharing information structure is presented.
Systems and Control • Optimization and Control