Search Results for author: Anas Barakat

Found 9 papers, 2 papers with code

Policy Mirror Descent with Lookahead

no code implementations • 21 Mar 2024 • Kimon Protopapas, Anas Barakat

In this work, we propose a new class of PMD algorithms called $h$-PMD, which incorporates multi-step greedy policy improvement with lookahead depth $h$ into the PMD update rule.

Reinforcement Learning (RL)

Independent Learning in Constrained Markov Potential Games

1 code implementation • 27 Feb 2024 • Philip Jordan, Anas Barakat, Niao He

We propose an independent policy gradient algorithm for learning approximate constrained Nash equilibria: Each agent observes their own actions and rewards, along with a shared state.

Multi-agent Reinforcement Learning

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

1 code implementation • 8 Sep 2023 • Jiduan Wu, Anas Barakat, Ilyas Fatkhullin, Niao He

Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks NE of zero-sum LQ games; (ii) in the model-free setting, we establish a $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator.

Multi-agent Reinforcement Learning, Policy Gradient Methods

Reinforcement Learning with General Utilities: Simpler Variance Reduction and Large State-Action Space

no code implementations • 2 Jun 2023 • Anas Barakat, Ilyas Fatkhullin, Niao He

We consider the reinforcement learning (RL) problem with general utilities which consists in maximizing a function of the state-action occupancy measure.

Reinforcement Learning (RL)

Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies

no code implementations • 3 Feb 2023 • Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, Niao He

Recently, the impressive empirical success of policy gradient (PG) methods has catalyzed the development of their theoretical foundations.

Policy Gradient Methods

Analysis of a Target-Based Actor-Critic Algorithm with Linear Function Approximation

no code implementations • 14 Jun 2021 • Anas Barakat, Pascal Bianchi, Julien Lehmann

Actor-critic methods integrating target networks have exhibited stupendous empirical success in deep reinforcement learning.

Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Non Convex Optimization

no code implementations • 18 Nov 2019 • Anas Barakat, Pascal Bianchi

In this work, we study the ADAM algorithm for smooth nonconvex optimization under a boundedness assumption on the adaptive learning rate.

Convergence Analysis of a Momentum Algorithm with Adaptive Step Size for Nonconvex Optimization

no code implementations • 25 Sep 2019 • Anas Barakat, Pascal Bianchi

In this work, we study the ADAM algorithm for smooth nonconvex optimization under a boundedness assumption on the adaptive learning rate.

Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization

no code implementations • 4 Oct 2018 • Anas Barakat, Pascal Bianchi

In the constant stepsize regime, assuming that the objective function is differentiable and non-convex, we establish the convergence in the long run of the iterates to a stationary point under a stability condition.

Stochastic Optimization
