Search Results for author: Sean Meyn

Found 14 papers, 0 papers with code

Convex Q Learning in a Stochastic Environment: Extended Version

no code implementations10 Sep 2023 Fan Lu, Sean Meyn

The first of the main contributions concerns properties of the relaxation, described as a deterministic convex program: we identify conditions for a bounded solution, and a significant relationship between the solution to the new convex program and the solution to standard Q-learning.

Q-Learning
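
For context, here is a sketch of the kind of deterministic convex program the relaxation refers to, in the spirit of the classical convex-programming characterization of the optimal Q-function; the notation (cost $c$, discount factor $\beta$, positive weighting $\mu$) is an assumption for this sketch and is not taken from the paper.

```latex
% Sketch only -- assumed notation, not the paper's exact program.
% Classical convex-programming bound on the optimal Q-function:
\[
\begin{aligned}
  \max_{Q}\quad & \langle \mu, Q \rangle \\
  \text{s.t.}\quad & Q(x,u) \;\le\; c(x,u)
    + \beta\, \mathrm{E}\Bigl[\, \min_{u'} Q(X_{t+1},u') \;\Big|\; X_t = x,\ U_t = u \Bigr]
    \quad \text{for all } (x,u).
\end{aligned}
\]
```

Monotonicity of the Bellman operator implies that any feasible $Q$ is a pointwise lower bound on the optimal Q-function, so the maximizer is optimal; convex Q-learning is based on relaxations of programs of this kind, with $Q$ restricted to a parameterized family.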

The Curse of Memory in Stochastic Approximation: Extended Version

no code implementations6 Sep 2023 Caio Kalil Lauand, Sean Meyn

The remaining results are established for linear SA recursions: (ii) the bivariate parameter-disturbance process is geometrically ergodic in a topological sense; (iii) the representation for bias has a simpler form in this case, and cannot be expected to be zero if there is multiplicative noise; (iv) the asymptotic covariance of the averaged parameters is within $O(\alpha)$ of optimal.
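
A small numerical illustration of the multiplicative-noise phenomenon described in (iii): a scalar linear SA recursion with constant step-size and a Markovian (AR(1)) disturbance. The recursion, noise model, and step-size are illustrative choices, not the paper's setting.

```python
# Illustrative only: scalar linear SA with constant step-size and Markovian
# multiplicative noise,
#     theta_{n+1} = theta_n + alpha * ( -(1 + Phi_{n+1}) * theta_n + b ),
# where Phi is a zero-mean AR(1) chain.  The mean flow is d/dt theta = -theta + b,
# with stationary point theta* = b, yet the time-averaged iterate typically shows
# a small bias of order alpha because the disturbance is correlated over time.
import numpy as np

rng = np.random.default_rng(0)
alpha, b, rho = 0.05, 1.0, 0.9            # step-size, target, AR(1) correlation
n_steps, burn_in = 500_000, 50_000

theta, phi = 0.0, 0.0
avg, count = 0.0, 0
for n in range(n_steps):
    phi = rho * phi + 0.2 * rng.standard_normal()   # correlated (Markovian) disturbance
    theta += alpha * (-(1.0 + phi) * theta + b)     # constant step-size SA update
    if n >= burn_in:
        count += 1
        avg += (theta - avg) / count                # running average of the iterates

print(f"averaged iterate = {avg:.4f}, mean-flow target theta* = {b}, gap = {avg - b:+.4f}")
```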

Stability of Q-Learning Through Design and Optimism

no code implementations5 Jul 2023 Sean Meyn

The algorithm is a general approach to stochastic approximation that, in particular, applies to Q-learning with "oblivious" training, even with non-linear function approximation.

Q-Learning

High-Impedance Non-Linear Fault Detection via Eigenvalue Analysis with low PMU Sampling Rates

no code implementations10 Jan 2023 Gian Paramo, Arturo Bretas, Sean Meyn

This technique holds several advantages over contemporary methods: it utilizes technology that is already deployed in the field, it offers a significant degree of generality, and so far it has displayed a very high level of sensitivity without sacrificing accuracy.

Fault Detection

High Impedance Fault Detection Through Quasi-Static State Estimation: A Parameter Error Modeling Approach

no code implementations20 Dec 2022 Austin Cooper, Arturo Bretas, Sean Meyn, Newton G. Bretas

This paper presents a model for detecting high-impedance faults (HIFs) using parameter error modeling and a two-step per-phase weighted least squares state estimation (SE) process.

Fault Detection
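
As a minimal sketch of the weighted least squares state-estimation building block that such approaches rest on, the snippet below runs a linear WLS estimate and a normalized-residual test; the paper's two-step, per-phase formulation and its parameter-error model are not reproduced, and the measurement model and numbers are illustrative only.

```python
# Minimal WLS state-estimation sketch with a normalized-residual check
# (linear measurement model z = H x + e; illustrative, not the paper's model).
import numpy as np

def wls_state_estimation(H, z, R):
    """Solve min_x (z - Hx)^T R^{-1} (z - Hx); return estimate and normalized residuals."""
    W = np.linalg.inv(R)                          # weights = inverse error covariance
    G = H.T @ W @ H                               # gain matrix
    x_hat = np.linalg.solve(G, H.T @ W @ z)       # WLS estimate
    r = z - H @ x_hat                             # measurement residuals
    S = R - H @ np.linalg.solve(G, H.T)           # residual covariance
    r_norm = r / np.sqrt(np.clip(np.diag(S), 1e-12, None))
    return x_hat, r_norm

# Toy usage: a large normalized residual flags a suspect measurement (or, in a
# parameter-error model, a suspect network parameter such as a fault conductance).
H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
x_true = np.array([1.02, 0.98])
R = 0.01**2 * np.eye(4)
z = H @ x_true + 0.01 * np.random.default_rng(1).standard_normal(4)
z[2] += 0.2                                       # gross error mimicking an un-modeled fault path
x_hat, r_norm = wls_state_estimation(H, z, R)
print("estimate:", x_hat, "normalized residuals:", np.round(r_norm, 1))
```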

Uncertainty Error Modeling for Non-Linear State Estimation With Unsynchronized SCADA and $\mu$PMU Measurements

no code implementations20 Dec 2022 Austin Cooper, Arturo Bretas, Sean Meyn, Newton G. Bretas

Distribution systems of the future smart grid require enhancements to the reliability of distribution system state estimation (DSSE) in the face of low measurement redundancy, unsynchronized measurements, and dynamic load profiles.

Sufficient Exploration for Convex Q-learning

no code implementations17 Oct 2022 Fan Lu, Prashant Mehta, Sean Meyn, Gergely Neu

The main contributions follow: (i) The dual of convex Q-learning is not precisely Manne's LP or a version of logistic Q-learning, but it has a similar structure that reveals the need for regularization to avoid over-fitting.

OpenAI Gym, Q-Learning

Model-Free Characterizations of the Hamilton-Jacobi-Bellman Equation and Convex Q-Learning in Continuous Time

no code implementations14 Oct 2022 Fan Lu, Joel Mathias, Sean Meyn, Karanjit Kalsi

Convex Q-learning is a recent approach to reinforcement learning, motivated by the possibility of a firmer theory for convergence, and the possibility of making use of greater a priori knowledge regarding policy or value function structure.

Q-Learning

The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

no code implementations27 Oct 2021 Vivek Borkar, Shuhang Chen, Adithya Devraj, Ioannis Kontoyiannis, Sean Meyn

In addition to standard Lipschitz assumptions and conditions on the vanishing step-size sequence, it is assumed that the associated mean flow $\tfrac{d}{dt}\vartheta_t = \bar{f}(\vartheta_t)$ is globally asymptotically stable, with stationary point denoted $\theta^*$, where $\bar{f}(\theta)=\mathrm{E}[f(\theta,\Phi)]$ with $\Phi$ having the stationary distribution of the chain.

Reinforcement Learning (RL)
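
A toy illustration of the objects in the statement above: a linear SA recursion, its mean flow $\tfrac{d}{dt}\vartheta_t = \bar{f}(\vartheta_t)$, and the stationary point $\theta^*$. The matrix $A$, noise model, and step-sizes are assumptions made for this sketch.

```python
# Illustrative (not from the paper): a linear SA recursion and its associated mean flow.
# SA:  theta_{n+1} = theta_n + alpha_{n+1} f(theta_n, Phi_{n+1}),  f(theta, phi) = A theta + b + phi
# Mean flow:  d/dt vartheta_t = fbar(vartheta_t),  fbar(theta) = E[f(theta, Phi)] = A theta + b,
# whose globally asymptotically stable equilibrium theta* solves A theta* + b = 0 (A Hurwitz).
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.3], [0.0, -0.5]])           # Hurwitz: eigenvalues -1, -0.5
b = np.array([1.0, 0.5])
theta_star = np.linalg.solve(A, -b)                # stationary point of the mean flow

# Stochastic approximation with vanishing step-size alpha_n = 1/n
theta = np.zeros(2)
for n in range(1, 200_001):
    phi = rng.standard_normal(2)                   # zero-mean disturbance (i.i.d. here)
    theta += (1.0 / n) * (A @ theta + b + phi)

# Euler integration of the deterministic mean flow for comparison
vartheta, dt = np.zeros(2), 0.01
for _ in range(5000):
    vartheta += dt * (A @ vartheta + b)

print("theta* =", theta_star, " SA iterate =", theta.round(3), " mean flow =", vartheta.round(3))
```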

Accelerating Optimization and Reinforcement Learning with Quasi-Stochastic Approximation

no code implementations30 Sep 2020 Shuhang Chen, Adithya Devraj, Andrey Bernstein, Sean Meyn

(ii) With gain $a_t = g/(1+t)$ the results are not as sharp: the rate of convergence $1/t$ holds only if $I + g A^*$ is Hurwitz.

Reinforcement Learning (RL)
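
A quick numerical check of the Hurwitz condition quoted above, taking $A^*$ to be the linearization of the mean flow at its stationary point (an assumption for this sketch); it illustrates that the $1/t$ rate condition requires the gain $g$ to be sufficiently large.

```python
# Illustrative check of the condition "I + g A* is Hurwitz", i.e. every eigenvalue
# of I + g A* has strictly negative real part.  A* is an assumed example matrix.
import numpy as np

def one_over_t_rate_condition(A_star, g):
    eigs = np.linalg.eigvals(np.eye(A_star.shape[0]) + g * A_star)
    return bool(np.all(eigs.real < 0)), eigs

A_star = np.array([[-1.0, 0.3], [0.0, -0.5]])      # eigenvalues -1 and -0.5
for g in (0.5, 1.5, 3.0):
    ok, eigs = one_over_t_rate_condition(A_star, g)
    print(f"g = {g}: eig(I + g*A*) = {np.round(eigs, 2)} -> 1/t rate condition {'holds' if ok else 'fails'}")
```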

Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning

no code implementations17 Sep 2018 Adithya M. Devraj, Ana Bušić, Sean Meyn

Two well-known SA techniques are known to have optimal asymptotic variance: the Ruppert-Polyak averaging technique and stochastic Newton-Raphson (SNR).

Q-Learning, Stochastic Optimization
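
A short sketch of the Ruppert-Polyak averaging technique mentioned above, applied to a toy linear SA recursion; the model, step-size exponent, and burn-in-free running average are illustrative choices.

```python
# Illustrative Ruppert-Polyak averaging: run SA with a "slow" step-size alpha_n = n^{-0.7}
# and report the running average of the iterates, which attains optimal asymptotic
# variance under standard conditions.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[-1.0, 0.3], [0.0, -0.5]])           # Hurwitz linear model (assumed example)
b = np.array([1.0, 0.5])
theta_star = np.linalg.solve(A, -b)

theta = np.zeros(2)
theta_bar = np.zeros(2)
for n in range(1, 100_001):
    alpha = n ** -0.7                              # slower-than-1/n step-size
    theta += alpha * (A @ theta + b + rng.standard_normal(2))
    theta_bar += (theta - theta_bar) / n           # running (Ruppert-Polyak) average

print("theta*          :", theta_star)
print("last SA iterate :", theta.round(3))
print("averaged iterate:", theta_bar.round(3))
```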

Zap Q-Learning

no code implementations NeurIPS 2017 Adithya M. Devraj, Sean Meyn

The Zap Q-learning algorithm introduced in this paper improves on Watkins' original algorithm and on recent competitors in several respects.

Q-Learning
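
Below is a hedged sketch of a matrix-gain ("Zap"-style) Q-learning update on a toy tabular MDP, following the general two-time-scale description of the algorithm (a fast estimate of the mean linearization matrix, a slower parameter update using its inverse). The MDP, step-size exponents, gain-matrix initialization, and use of a pseudo-inverse are illustrative choices, not the paper's exact specification.

```python
# Sketch of a matrix-gain ("Zap"-style) Q-learning update; illustrative choices throughout.
import numpy as np

rng = np.random.default_rng(0)
nS, nA, beta = 3, 2, 0.9
P = rng.dirichlet(np.ones(nS), size=(nS, nA))      # transition kernel P[x, u] over next states
R = rng.uniform(0.0, 1.0, size=(nS, nA))           # rewards r(x, u)

d = nS * nA
def psi(x, u):                                     # tabular basis: indicator vector for (x, u)
    e = np.zeros(d); e[x * nA + u] = 1.0; return e

theta = np.zeros(d)                                # Q_theta(x, u) = theta[x * nA + u]
A_hat = -np.eye(d)                                 # matrix-gain estimate (illustrative init)
x = 0
for n in range(1, 100_001):
    u = int(rng.integers(nA))                      # uniform random exploration
    x_next = int(rng.choice(nS, p=P[x, u]))
    q_next = theta.reshape(nS, nA)[x_next]
    u_star = int(np.argmax(q_next))                # greedy action at the next state

    alpha = 1.0 / n                                # slower time scale: parameter update
    gamma = (1.0 + n) ** -0.85                     # faster time scale: matrix-gain update

    td = R[x, u] + beta * q_next[u_star] - theta[x * nA + u]             # TD error
    A_n = np.outer(psi(x, u), beta * psi(x_next, u_star) - psi(x, u))    # per-sample A matrix
    A_hat += gamma * (A_n - A_hat)                                       # fast estimate of A
    theta -= alpha * np.linalg.pinv(A_hat) @ (psi(x, u) * td)            # Zap-style update
    x = x_next

print("estimated Q(x, u):\n", theta.reshape(nS, nA).round(3))
```

With the tabular basis and persistent exploration, the matrix gain plays the role of a stochastic Newton-Raphson step; the pseudo-inverse is only a convenient way to handle a poorly conditioned early estimate in this sketch.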
