no code implementations • 1 Feb 2024 • Weiqin Chen, James Onyejizu, Long Vu, Lan Hoang, Dharmashankar Subramanian, Koushik Kar, Sandipan Mishra, Santiago Paternain
In this paper, we propose, analyze and evaluate adaptive primal-dual (APD) methods for safe reinforcement learning (SRL), where two adaptive learning rates (LRs) are adjusted to the Lagrangian multipliers so as to optimize the policy in each iteration.
1 code implementation • 18 Jan 2024 • Arindam Chowdhury, Santiago Paternain, Gunjan Verma, Ananthram Swami, Santiago Segarra
The problem of optimal power allocation -- for maximizing a given network utility metric -- under instantaneous constraints has recently gained significant popularity.
no code implementations • 17 Jan 2024 • Anmol Dwivedi, Santiago Paternain, Ali Tajer
This paper considers the sequential design of remedial control actions in response to system anomalies for the ultimate objective of preventing blackouts.
no code implementations • 29 Jun 2023 • Weiqin Chen, Dharmashankar Subramanian, Santiago Paternain
Furthermore, we propose a Safe Primal-Dual algorithm that can leverage both SPGs to learn safe policies.
no code implementations • 2 Oct 2022 • Weiqin Chen, Dharmashankar Subramanian, Santiago Paternain
In particular, we consider the notion of probabilistic safety.
1 code implementation • 21 Jan 2022 • Sergio Rozada, Santiago Paternain, Antonio G. Marques
Value-function (VF) approximation is a central problem in Reinforcement Learning (RL).
no code implementations • 8 Mar 2021 • Luiz F. O. Chamon, Santiago Paternain, Miguel Calvo-Fullana, Alejandro Ribeiro
In this paper, we overcome this issue by learning in the empirical dual domain, where constrained statistical learning problems become unconstrained and deterministic.
no code implementations • 24 Feb 2021 • Miguel Calvo-Fullana, Luiz F. O. Chamon, Santiago Paternain
However, to transfer from learning safety to learning safely, there are two hurdles that need to be overcome: (i) it has to be possible to learn the policy without having to re-initialize the system; and (ii) the rollouts of the system need to be in themselves safe.
no code implementations • 23 Feb 2021 • Miguel Calvo-Fullana, Santiago Paternain, Luiz F. O. Chamon, Alejandro Ribeiro
Thus, as we illustrate by an example, while previous methods can fail at finding optimal policies, running the dual dynamics while executing the augmented policy yields an algorithm that provably samples actions from the optimal policy.
no code implementations • 11 Feb 2021 • Clark Zhang, Santiago Paternain, Alejandro Ribeiro
This paper introduces the constrained Sufficiently Accurate model learning approach, provides examples of such problems, and presents a theorem on how close some approximate solutions can be.
no code implementations • 24 Nov 2020 • Luiz F. O. Chamon, Santiago Paternain, Alejandro Ribeiro
Prediction credibility measures, in the form of confidence intervals or probability distributions, are fundamental in statistics and machine learning to characterize model robustness, detect out-of-distribution samples (outliers), and protect against adversarial attacks.
no code implementations • 16 Oct 2020 • Santiago Paternain, Juan Andres Bazerque, Alejandro Ribeiro
To that end we compute unbiased stochastic gradients of the value function which we use as ascent directions to update the policy.
no code implementations • L4DC 2020 • Luiz F. O. Chamon, Santiago Paternain, Alejandro Ribeiro
In recent years, considerable work has been done to tackle the issue of designing control laws based on observations to allow unknown dynamical systems to perform pre-specified tasks.
no code implementations • 12 Feb 2020 • Luiz F. O. Chamon, Santiago Paternain, Miguel Calvo-Fullana, Alejandro Ribeiro
This paper is concerned with the study of constrained statistical learning problems, the unconstrained versions of which are at the core of virtually all of modern information processing.
no code implementations • 20 Nov 2019 • Santiago Paternain, Miguel Calvo-Fullana, Luiz F. O. Chamon, Alejandro Ribeiro
The advantages of the proposed relaxation are threefold.
no code implementations • NeurIPS 2019 • Santiago Paternain, Luiz F. O. Chamon, Miguel Calvo-Fullana, Alejandro Ribeiro
The latter is generally addressed by formulating the conflicting requirements as a constrained RL problem, which is then solved using primal-dual methods.
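As a rough illustration of the primal-dual idea named in this abstract, the sketch below runs gradient ascent on a primal variable and projected gradient descent on a Lagrange multiplier for a toy scalar problem: maximize -(x - 2)^2 subject to 1 - x >= 0. The problem, the function name `primal_dual`, and the step size are assumptions made for illustration only; this is not the paper's algorithm and involves no policy or environment.

```python
def primal_dual(steps=3000, eta=0.05):
    """Toy primal-dual iteration (illustrative assumption, not the paper's method).

    Problem: maximize -(x - 2)^2 subject to 1 - x >= 0.
    Lagrangian: L(x, lam) = -(x - 2)^2 + lam * (1 - x), with lam >= 0.
    """
    x, lam = 0.0, 0.0  # primal variable and Lagrange multiplier
    for _ in range(steps):
        grad_x = -2.0 * (x - 2.0) - lam    # dL/dx
        slack = 1.0 - x                    # dL/dlam, the constraint slack
        x_new = x + eta * grad_x           # gradient ascent on the primal variable
        lam = max(0.0, lam - eta * slack)  # projected gradient descent on the dual
        x = x_new
    return x, lam


if __name__ == "__main__":
    x, lam = primal_dual()
    print(f"x = {x:.6f}, lambda = {lam:.6f}")
```

For this toy problem the KKT conditions give x = 1 and lambda = 2, so the iterates can be checked against that point. The multiplier stays at zero while the constraint is satisfied with slack and grows once x exceeds 1, which is the mechanism that trades off the objective against the constraint.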
no code implementations • 7 May 2019 • Maria Peifer, Luiz F. O. Chamon, Santiago Paternain, Alejandro Ribeiro
To address the complexity issues, we then write the function estimation problem as a sparse functional program that explicitly minimizes the support of the representation, leading to low-complexity solutions.
no code implementations • 19 Feb 2019 • Clark Zhang, Arbaaz Khan, Santiago Paternain, Alejandro Ribeiro
In this paper, we investigate a method to regularize model learning techniques to provide better error characteristics for traditional control and planning algorithms.
no code implementations • 11 Oct 2017 • Alec Koppel, Santiago Paternain, Cedric Richard, Alejandro Ribeiro
That is, we establish that with constant step-size selections agents' functions converge to a neighborhood of the globally optimal one while satisfying the consensus constraints as the penalty parameter is increased.