no code implementations • 1 Mar 2023 • Pedro Cisneros-Velarde, Sanmi Koyejo
Nash Q-learning may be considered one of the earliest and best-known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game.
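A minimal sketch of the Nash-Q update for player 1 in a two-player game. The function names are hypothetical, and for tractability the stage-game equilibrium is computed only for the special case of a zero-sum stage game with a pure-strategy saddle point; the general-sum setting studied in the paper requires a full Nash solver at each state.

```python
import numpy as np

def saddle_point_value(Q1_s):
    # Value of a zero-sum stage game, *assuming* a pure-strategy
    # saddle point exists (maximin == minimax). A general-sum game
    # would need a proper stage-game Nash solver here instead.
    maximin = Q1_s.min(axis=1).max()
    minimax = Q1_s.max(axis=0).min()
    assert np.isclose(maximin, minimax), "no pure-strategy saddle point"
    return maximin

def nash_q_update(Q1, s, a1, a2, r1, s_next, alpha=0.1, gamma=0.9):
    # Q1[s] is the |A1| x |A2| payoff matrix of player 1 at state s.
    # Standard Nash-Q shape: bootstrap with the equilibrium value of
    # the next state's stage game rather than a max over own actions.
    nash_value = saddle_point_value(Q1[s_next])
    Q1[s, a1, a2] += alpha * (r1 + gamma * nash_value - Q1[s, a1, a2])
    return Q1
```

Replacing the max of single-agent Q-learning with an equilibrium value of the next-state stage game is the defining design choice of Nash Q-learning.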
no code implementations • 29 Sep 2022 • Arindam Banerjee, Pedro Cisneros-Velarde, Libin Zhu, Mikhail Belkin
Second, we introduce a new analysis of optimization based on Restricted Strong Convexity (RSC) which holds as long as the squared norm of the average gradient of predictors is $\Omega(\frac{\text{poly}(L)}{\sqrt{m}})$ for the square loss.
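The quantity in the stated condition, the squared norm of the sample-averaged gradient of the predictors, can be computed directly. A hedged numpy sketch for a one-hidden-layer network (the architecture, activation, and function names are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def predictor_grad(W, v, x):
    # Gradient of the scalar predictor f(x) = v^T tanh(W x)
    # with respect to all parameters (W, v), flattened.
    h = np.tanh(W @ x)                # hidden activations
    dv = h                            # df/dv
    dW = np.outer(v * (1 - h**2), x)  # df/dW via the chain rule
    return np.concatenate([dW.ravel(), dv])

def avg_predictor_grad_sqnorm(W, v, X):
    # Squared Euclidean norm of the sample-averaged predictor gradient:
    # the quantity the RSC condition lower-bounds by Omega(poly(L)/sqrt(m)).
    grads = np.stack([predictor_grad(W, v, x) for x in X])
    return float(np.linalg.norm(grads.mean(axis=0))**2)
```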
1 code implementation • 7 Jun 2022 • Amnon Attali, Pedro Cisneros-Velarde, Marco Morales, Nancy M. Amato
While the difficulty of reinforcement learning problems is typically related to the complexity of their state spaces, abstraction proposes that solutions often lie in simpler underlying latent spaces.
no code implementations • 31 May 2022 • Pedro Cisneros-Velarde, Boxiang Lyu, Sanmi Koyejo, Mladen Kolar
Although parallelism has been extensively used in reinforcement learning (RL), the quantitative effects of parallel exploration are not well understood theoretically.
no code implementations • 18 May 2021 • Pedro Cisneros-Velarde, Francesco Bullo
Much recent interest has focused on the design of optimization algorithms from the discretization of an associated optimization flow, i.e., a system of ordinary differential equations (ODEs) whose trajectories solve an associated optimization problem.
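The simplest instance of this design pattern: forward-Euler discretization of the gradient flow x'(t) = -grad f(x(t)) recovers gradient descent. A minimal sketch (function names are illustrative, not from the paper):

```python
import numpy as np

def euler_discretize_gradient_flow(grad_f, x0, step, n_steps):
    # Forward-Euler discretization of the optimization flow
    #   x'(t) = -grad f(x(t)).
    # One Euler step with step size `step` is exactly one
    # gradient-descent iteration: x <- x - step * grad_f(x).
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        x = x - step * grad_f(x)
    return x
```

For f(x) = ||x||^2 / 2 (so grad_f is the identity), the discretized flow contracts toward the minimizer at the origin.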
no code implementations • 15 Dec 2020 • Pedro Cisneros-Velarde, Francesco Bullo
Consider a multi-agent system whereby each agent has an initial probability measure.
no code implementations • 27 Mar 2020 • Pedro Cisneros-Velarde, Saber Jafarpour, Francesco Bullo
In this note, we provide an overarching analysis of the primal-dual dynamics associated with linear equality-constrained optimization problems using contraction analysis.
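For min f(x) subject to Ax = b, the standard primal-dual (saddle-point) dynamics are x' = -grad f(x) - A^T lam and lam' = Ax - b. A hedged forward-Euler sketch of one step (a textbook discretization, not the paper's contraction machinery):

```python
import numpy as np

def primal_dual_step(x, lam, grad_f, A, b, step):
    # One forward-Euler step of the primal-dual dynamics
    #   x'   = -grad f(x) - A^T lam   (primal descent)
    #   lam' =  A x - b               (dual ascent)
    # for the problem: min f(x) subject to A x = b.
    x_new = x - step * (grad_f(x) + A.T @ lam)
    lam_new = lam + step * (A @ x - b)
    return x_new, lam_new
```

For f(x) = ||x||^2 / 2 with the constraint x1 + x2 = 1, iterating this step converges to the optimizer x* = (0.5, 0.5) with multiplier lam* = -0.5, consistent with the stationarity condition grad f(x*) + A^T lam* = 0.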
1 code implementation • 22 May 2019 • Pedro Cisneros-Velarde, Sang-Yun Oh, Alexander Petersen
As a consequence of this formulation, the radius of the Wasserstein ambiguity set is directly related to the regularization parameter in the estimation problem.
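A well-known analogue of this radius-to-regularizer correspondence (illustrative only, not this paper's exact estimator): several Wasserstein-DRO formulations of linear regression reduce to a norm-regularized problem whose penalty weight equals the ambiguity radius. Sketched here with the radius playing the role of the l1 penalty in ISTA for the lasso:

```python
import numpy as np

def lasso_ista(X, y, radius, step=None, n_iter=2000):
    # Solve  min_b  (1/(2n)) ||y - X b||^2 + radius * ||b||_1
    # by ISTA (proximal gradient). The ambiguity-set radius acts
    # as the regularization parameter, mirroring the DRO-to-
    # regularization correspondence described above.
    n, d = X.shape
    if step is None:
        step = 1.0 / (np.linalg.norm(X, 2)**2 / n)  # 1/L of the smooth part
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft-thresholding: the proximal operator of the l1 penalty.
        b = np.sign(z) * np.maximum(np.abs(z) - step * radius, 0.0)
    return b
```

A small radius (weak ambiguity) yields a near-unregularized fit, while a large radius shrinks the estimate to zero, matching the intuition that a larger ambiguity set demands a more conservative estimator.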