Search Results for author: Ather Gattami

Found 6 papers, 0 papers with code

Deep Reinforcement Learning: A Convex Optimization Approach

no code implementations • 29 Feb 2024 • Ather Gattami

That is, when the number of episodes goes to infinity, there exists a constant $C$ such that \[\|w-w^\star\| \le C\cdot\frac{\rho}{T}.\] In particular, our algorithm converges arbitrarily close to the optimal neural network parameters as the time horizon increases or as the regularization parameter decreases.
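As a rough illustration of how the stated bound behaves, the sketch below evaluates only its right-hand side, $C\cdot\rho/T$. The function name and the numeric values of `C`, `rho`, and `T` are hypothetical; the abstract says only that such a constant $C$ exists, not what it is.

```python
def parameter_error_bound(C: float, rho: float, T: int) -> float:
    """Right-hand side C * rho / T of the bound quoted above.

    C   : the (unknown) constant from the abstract; any value used here is an assumption
    rho : regularization parameter
    T   : time horizon
    """
    if T <= 0:
        raise ValueError("T must be a positive integer")
    return C * rho / T


# The bound shrinks as the horizon T grows (C and rho held fixed) ...
bounds_over_T = [parameter_error_bound(C=1.0, rho=0.1, T=T) for T in (10, 100, 1000)]

# ... and as the regularization parameter rho decreases (C and T held fixed).
bounds_over_rho = [parameter_error_bound(C=1.0, rho=rho, T=100) for rho in (1.0, 0.1, 0.01)]
```

Both lists are decreasing, which matches the abstract's remark that the parameters get arbitrarily close to optimal as the horizon increases or the regularization parameter decreases.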

reinforcement-learning

Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds

no code implementations • 27 Jan 2023 • Johan Östman, Ather Gattami, Daniel Gillblad

We consider a decentralized multiplayer game, played over $T$ rounds, with a leader-follower hierarchy described by a directed acyclic graph.

Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints

no code implementations • 10 Jun 2020 • Qinbo Bai, Vaneet Aggarwal, Ather Gattami

This paper uses concepts from constrained optimization and Q-learning to propose an algorithm for CMDP with long-term constraints.

Q-Learning

Provably Efficient Model-Free Algorithm for MDPs with Peak Constraints

no code implementations • 11 Mar 2020 • Qinbo Bai, Vaneet Aggarwal, Ather Gattami

The proposed algorithm is proved to achieve an $(\epsilon, p)$-PAC policy when the episode $K\geq\Omega(\frac{I^2H^6SA\ell}{\epsilon^2})$, where $S$ and $A$ are the number of states and actions, respectively.
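To get a feel for how this episode requirement scales, the sketch below evaluates the expression inside the $\Omega(\cdot)$. It is an assumption-laden illustration: the hidden constant `c` is hypothetical (set to 1 here), and since this excerpt does not define $I$, $H$, or $\ell$, they are treated as plain positive numbers.

```python
import math


def min_episodes(I: float, H: float, S: int, A: int, ell: float, eps: float,
                 c: float = 1.0) -> int:
    """Evaluate c * I^2 * H^6 * S * A * ell / eps^2, rounded up.

    c is a stand-in for the constant hidden by the Omega notation; its true
    value is not given in the excerpt, so c = 1.0 is purely illustrative.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    return math.ceil(c * I**2 * H**6 * S * A * ell / eps**2)


# Hypothetical small instance: the H^6 term dominates the growth.
k_base = min_episodes(I=1, H=2, S=3, A=2, ell=1.0, eps=0.5)
# Halving eps multiplies the requirement by four (1 / eps^2 scaling).
k_tighter = min_episodes(I=1, H=2, S=3, A=2, ell=1.0, eps=0.25)
```

With these made-up inputs, the requirement quadruples when `eps` is halved, and doubling `H` would multiply it by 64.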

Q-Learning, Scheduling

Conditional Mutual information-based Contrastive Loss for Financial Time Series Forecasting

no code implementations • 18 Feb 2020 • Hanwei Wu, Ather Gattami, Markus Flierl

One challenge of using deep learning models for finance forecasting is the shortage of available training data when using small datasets.

Representation Learning, Time Series

Reinforcement Learning of Markov Decision Processes with Peak Constraints

no code implementations • 23 Jan 2019 • Ather Gattami

We introduce a game theoretic approach to construct reinforcement learning algorithms where the agent maximizes an unconstrained objective that depends on the simulated action of the minimizing opponent which acts on a finite set of actions and the output data of the constraint functions (rewards).

Q-Learning, reinforcement-learning
