no code implementations • 29 Feb 2024 • Ather Gattami
That is, when the number of episodes goes to infinity, there exists a constant $C$ such that \[\|w-w^\star\| \le C\cdot\frac{\rho}{T}.\] In particular, our algorithm converges arbitrarily close to the optimal neural network parameters as the time horizon increases or as the regularization parameter decreases.
no code implementations • 27 Jan 2023 • Johan Östman, Ather Gattami, Daniel Gillblad
We consider a decentralized multiplayer game, played over $T$ rounds, with a leader-follower hierarchy described by a directed acyclic graph.
no code implementations • 10 Jun 2020 • Qinbo Bai, Vaneet Aggarwal, Ather Gattami
This paper uses concepts from constrained optimization and Q-learning to propose an algorithm for CMDP with long-term constraints.
no code implementations • 11 Mar 2020 • Qinbo Bai, Vaneet Aggarwal, Ather Gattami
The proposed algorithm is proved to achieve an $(\epsilon, p)$-PAC policy when the number of episodes satisfies $K\geq\Omega(\frac{I^2H^6SA\ell}{\epsilon^2})$, where $S$ and $A$ are the number of states and actions, respectively.
no code implementations • 18 Feb 2020 • Hanwei Wu, Ather Gattami, Markus Flierl
One challenge of using deep learning models for financial forecasting is the shortage of training data when only small datasets are available.
no code implementations • 23 Jan 2019 • Ather Gattami
We introduce a game-theoretic approach to constructing reinforcement learning algorithms in which the agent maximizes an unconstrained objective that depends on the simulated action of the minimizing opponent, which acts on a finite set of actions and the output data of the constraint functions (rewards).