Search Results for author: Aditya Mahajan

Found 20 papers, 2 papers with code

Model approximation in MDPs with unbounded per-step cost

no code implementations13 Feb 2024 Berk Bozkurt, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

How well does an optimal policy $\hat{\pi}^{\star}$ of the approximate model perform when used in the original model $\mathcal{M}$?
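
One way to make this question precise (the notation below is illustrative rather than quoted from the paper) is to bound the sub-optimality gap $\Delta(\hat{\pi}^{\star}) = \| V^{\pi^{\star}}_{\mathcal{M}} - V^{\hat{\pi}^{\star}}_{\mathcal{M}} \|$, where $V^{\pi}_{\mathcal{M}}$ denotes the value function of policy $\pi$ in the original model $\mathcal{M}$ and $\pi^{\star}$ is an optimal policy for $\mathcal{M}$; since the per-step cost is unbounded, such guarantees are typically stated in a weighted norm rather than the sup norm.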

Approximate information state based convergence analysis of recurrent Q-learning

no code implementations9 Jun 2023 Erfan Seyedsalehi, Nima Akbarzadeh, Amit Sinha, Aditya Mahajan

In spite of the large literature on reinforcement learning (RL) algorithms for partially observable Markov decision processes (POMDPs), a complete theoretical understanding is still lacking.

Q-Learning Reinforcement Learning (RL)

On learning history based policies for controlling Markov decision processes

no code implementations6 Nov 2022 Gandharv Patil, Aditya Mahajan, Doina Precup

Reinforcement learning (RL) folklore suggests that history-based function approximation methods, such as recurrent neural nets or history-based state abstraction, perform better than their memory-less counterparts, because function approximation in Markov decision processes (MDPs) can be viewed as inducing a partially observable MDP.

Continuous Control

On learning Whittle index policy for restless bandits with scalable regret

no code implementations7 Feb 2022 Nima Akbarzadeh, Aditya Mahajan

In particular, we consider a restless bandit model, and propose a Thompson-sampling based learning algorithm which is tuned to the underlying structure of the model.
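
For readers unfamiliar with posterior sampling, the sketch below shows a generic Thompson-sampling loop in Python for a bandit with unknown Bernoulli-type parameters. It is an illustration under stated assumptions (Beta priors, a made-up reward model, greedy arm selection), not the structured algorithm proposed in the paper, which would instead compute an index policy for each sampled model.

# Generic Thompson-sampling loop (illustrative only; priors, dynamics, and
# the arm-selection rule are placeholders, not the paper's algorithm).
import numpy as np

rng = np.random.default_rng(0)
n_arms, horizon = 5, 1000

# Beta(alpha, beta) posterior over each arm's unknown success probability.
alpha = np.ones(n_arms)
beta = np.ones(n_arms)

total_reward = 0.0
for t in range(horizon):
    # 1. Sample a model (one parameter per arm) from the posterior.
    sampled_p = rng.beta(alpha, beta)
    # 2. Act for the sampled model; here greedily, whereas a structured
    #    restless-bandit algorithm would compute, e.g., a Whittle index.
    arm = int(np.argmax(sampled_p))
    # 3. Observe the outcome and update the chosen arm's posterior.
    reward = float(rng.random() < 0.3 + 0.1 * arm)  # stand-in for the unknown dynamics
    alpha[arm] += reward
    beta[arm] += 1.0 - reward
    total_reward += reward

print(f"average reward: {total_reward / horizon:.3f}")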

Scheduling Thompson Sampling

Strong Consistency and Rate of Convergence of Switched Least Squares System Identification for Autonomous Markov Jump Linear Systems

no code implementations20 Dec 2021 Borna Sayedana, Mohammad Afshari, Peter E. Caines, Aditya Mahajan

These results show that the switched least squares method for MJS has the same rate of convergence as the least squares method for autonomous linear systems.
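
As a rough illustration of the estimator being analyzed (a sketch under the assumption that the mode sequence is observed, not the paper's code), switched least squares runs ordinary least squares separately on the time steps spent in each mode of the jump system $x_{t+1} = A_{m_t} x_t + w_t$:

import numpy as np

def switched_least_squares(states, modes, n_modes):
    """Estimate one A-matrix per mode from a single trajectory.

    states: array of shape (T+1, n) containing x_0, ..., x_T
    modes:  integer array of shape (T,) containing m_0, ..., m_{T-1}
    """
    A_hat = []
    for m in range(n_modes):
        idx = np.where(modes == m)[0]
        X = states[idx]        # regressors x_t for the steps spent in mode m
        Y = states[idx + 1]    # targets    x_{t+1}
        # Ordinary least squares for this mode: solve X @ B ~= Y, so B = A_m^T.
        B, *_ = np.linalg.lstsq(X, Y, rcond=None)
        A_hat.append(B.T)
    return A_hat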

Scalable regret for learning to control network-coupled subsystems with unknown dynamics

no code implementations18 Aug 2021 Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

We consider the problem of controlling an unknown linear quadratic Gaussian (LQG) system consisting of multiple subsystems connected over a network.

Thompson Sampling

Structure-aware reinforcement learning for node-overload protection in mobile edge computing

no code implementations29 Jun 2021 Anirudha Jitani, Aditya Mahajan, Zhongwen Zhu, Hatem Abou-zeid, Emmanuel T. Fapi, Hakimeh Purmehdi

Mobile Edge Computing (MEC) refers to the concept of placing computational capability and applications at the edge of the network, providing benefits such as reduced latency in handling client requests, reduced network congestion, and improved performance of applications.

Edge-computing reinforcement-learning +1

Two families of indexable partially observable restless bandits and Whittle index computation

no code implementations12 Apr 2021 Nima Akbarzadeh, Aditya Mahajan

We consider restless bandits with a general state space under partial observability, with two observational models: first, the state of each bandit is not observable at all; and second, the state of each bandit is observable only if it is chosen.

Thompson sampling for linear quadratic mean-field teams

no code implementations9 Nov 2020 Mukul Gagrani, Sagar Sudhakara, Aditya Mahajan, Ashutosh Nayyar, Yi Ouyang

We consider optimal control of an unknown multi-agent linear quadratic (LQ) system where the dynamics and the cost are coupled across the agents through the mean-field (i.e., empirical mean) of the states and controls.

Thompson Sampling

Approximate information state for approximate planning and reinforcement learning in partially observed systems

1 code implementation17 Oct 2020 Jayakumar Subramanian, Amit Sinha, Raihan Seraj, Aditya Mahajan

Our key result is to show that if a function of the history (called approximate information state (AIS)) approximately satisfies the properties of the information state, then there is a corresponding approximate dynamic program.
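
Informally, and paraphrasing the general idea rather than the paper's exact statement, an AIS is a compression $\hat{z}_t = \sigma_t(h_t)$ of the history $h_t$, together with functions $(\hat{r}, \hat{P})$, such that (i) $|\mathbb{E}[r_t \mid h_t, a_t] - \hat{r}(\hat{z}_t, a_t)| \le \varepsilon$ and (ii) the conditional distribution of $\hat{z}_{t+1}$ given $(h_t, a_t)$ is within $\delta$ (in a suitable metric) of $\hat{P}(\cdot \mid \hat{z}_t, a_t)$. The approximate dynamic program is then defined over $\hat{z}_t$ instead of the full history, with value-error bounds expressed in terms of $(\varepsilon, \delta)$.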

reinforcement-learning Reinforcement Learning (RL)

Conditions for indexability of restless bandits and an O(K^3) algorithm to compute Whittle index

no code implementations13 Aug 2020 Nima Akbarzadeh, Aditya Mahajan

We then revisit a previously proposed algorithm, called the adaptive greedy algorithm, which is known to compute the Whittle index for a subclass of restless bandits.

Decentralized linear quadratic systems with major and minor agents and non-Gaussian noise

no code implementations24 Apr 2020 Mohammad Afshari, Aditya Mahajan

It is shown that the major agent's optimal control action is a linear function of the major agent's MMSE (minimum mean squared error) estimate of the system state. The minor agent's optimal control action is a linear function of the major agent's MMSE estimate of the system state plus a "correction term", which depends on the difference between the minor agent's MMSE estimate of its local state and the major agent's MMSE estimate of the minor agent's local state.

Multi-agent estimation and filtering for minimizing team mean-squared error

no code implementations28 Mar 2019 Mohammad Afshari, Aditya Mahajan

We derive closed-form expressions for the MTMSE estimates, which are linear functions of the observations, where the corresponding gains depend on the weight matrix that couples the estimation errors.
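
As an illustrative reading of this setup (the notation here is assumed, not taken verbatim from the paper), the team objective can be written as $\min \mathbb{E}[(x - \hat{x})^{\top} S (x - \hat{x})]$, where $\hat{x}$ stacks the agents' estimates and the weight matrix $S$ couples one agent's estimation error to another's; it is this coupling through $S$ that makes the optimal gains weight-dependent, in contrast to the classical single-agent MMSE problem.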

Autonomous Vehicles

Renewal Monte Carlo: Renewal theory based reinforcement learning

no code implementations3 Apr 2018 Jayakumar Subramanian, Aditya Mahajan

We generalize the RMC algorithm to post-decision state models and also present a variant that converges faster to an approximately optimal policy.

Management reinforcement-learning +1

Sufficient conditions for the value function and optimal strategy to be even and quasi-convex

no code implementations31 Mar 2017 Jhelum Chakravorty, Aditya Mahajan

Sufficient conditions are identified under which the value function and the optimal strategy of a Markov decision process (MDP) are even and quasi-convex in the state.

Optimization and Control

Forensic Analysis of Instant Messenger Applications on Android Devices

no code implementations17 Apr 2013 Aditya Mahajan, M. S. Dahiya, H. P. Sanghvi

This paper focuses on forensic data analysis of two widely used IM applications on Android phones: WhatsApp and Viber.

Computers and Society Cryptography and Security

Decentralized Stochastic Control with Partial History Sharing: A Common Information Approach

no code implementations8 Sep 2012 Ashutosh Nayyar, Aditya Mahajan, Demosthenis Teneketzis

A general model of decentralized stochastic control called partial history sharing information structure is presented.

Systems and Control Optimization and Control
