Search Results for author: Sean P. Meyn

Found 9 papers, 0 papers with code

Controlled Interacting Particle Algorithms for Simulation-based Reinforcement Learning

no code implementations · 2 Jul 2021 · Anant Joshi, Amirhossein Taghvaei, Prashant G. Mehta, Sean P. Meyn

This paper is concerned with optimal control problems for control systems in continuous time, and interacting particle system methods designed to construct approximate control solutions.

reinforcement-learning, Reinforcement Learning (RL)

Convex Q-Learning, Part 1: Deterministic Optimal Control

no code implementations · 8 Aug 2020 · Prashant G. Mehta, Sean P. Meyn

It is shown that the algorithms are in fact very different: while convex Q-learning solves a convex program that approximates the Bellman equation, the theory for DQN is no stronger than that for Watkins' algorithm with function approximation: (a) both seek solutions to the same fixed point equation, and (b) the ODE approximations for the two algorithms coincide, and little is known about the stability of this ODE.

Q-Learning
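The fixed point equation referenced in the abstract is the Bellman equation for the Q-function. As a rough illustration of what Watkins' algorithm approximates, here is a minimal tabular sketch on a made-up two-state MDP (the MDP and step-size choice are illustrative assumptions, not an example from the paper):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP, invented for illustration.
# P[s, a, s'] are transition probabilities, R[s, a] one-step rewards.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.3, 0.7], [0.8, 0.2]]])
R = np.array([[2.0, 0.0], [0.0, 2.0]])
gamma = 0.9

# Exact fixed point via value iteration:
#   Q*(s,a) = R(s,a) + gamma * E[max_a' Q*(s',a')]
Qstar = np.zeros((2, 2))
for _ in range(2000):
    Qstar = R + gamma * P @ Qstar.max(axis=1)

# Watkins' Q-learning: stochastic approximation of the same fixed point.
rng = np.random.default_rng(0)
Q = np.zeros((2, 2))
N = np.zeros((2, 2))                 # per-entry visit counts
s = 0
for _ in range(200_000):
    a = rng.integers(2)              # uniform exploration
    s2 = rng.choice(2, p=P[s, a])
    N[s, a] += 1
    step = 1.0 / N[s, a] ** 0.6      # polynomial step size (an assumption)
    Q[s, a] += step * (R[s, a] + gamma * Q[s2].max() - Q[s, a])
    s = s2
```

Both recursions target the same fixed point; the papers in this listing are largely about how quickly (and how reliably) the stochastic recursion gets there.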

Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning

no code implementations · 24 Feb 2020 · Adithya M. Devraj, Sean P. Meyn

Sample complexity bounds are a common performance metric in the Reinforcement Learning literature.

Q-Learning

Zap Q-Learning With Nonlinear Function Approximation

no code implementations · NeurIPS 2020 · Shuhang Chen, Adithya M. Devraj, Fan Lu, Ana Bušić, Sean P. Meyn

Based on multiple experiments with a range of neural network sizes, it is found that the new algorithms converge quickly and are robust to choice of function approximation architecture.

OpenAI Gym, Q-Learning

Zap Q-Learning for Optimal Stopping Time Problems

no code implementations · 25 Apr 2019 · Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn

The objective in this paper is to obtain fast converging reinforcement learning algorithms to approximate solutions to the problem of discounted cost optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on a compact subset of $\mathbb{R}^n$.

Q-Learning
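The discounted-cost optimal stopping problem described here has a Bellman equation with a max between stopping and continuing. A deterministic value-iteration sketch on a hypothetical three-state chain (all numbers invented for illustration; the paper concerns reinforcement learning approximations of this fixed point, not exact value iteration):

```python
import numpy as np

# Hypothetical 3-state chain, invented for illustration.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3],
              [0.3, 0.3, 0.4]])
g = np.array([1.0, 0.0, 3.0])   # reward collected on stopping
c = np.array([0.2, 0.5, 0.1])   # running reward while continuing
gamma = 0.95

# Value iteration for the optimal stopping Bellman equation:
#   V(x) = max( g(x), c(x) + gamma * sum_y P(x,y) V(y) )
V = np.zeros(3)
for _ in range(5000):
    V = np.maximum(g, c + gamma * P @ V)

# Optimal stopping region: states where stopping beats continuing.
stop = g >= c + gamma * P @ V
```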

Differential Temporal Difference Learning

no code implementations · 28 Dec 2018 · Adithya M. Devraj, Ioannis Kontoyiannis, Sean P. Meyn

Value functions derived from Markov decision processes arise as a central component of algorithms as well as performance metrics in many statistics and engineering applications of machine learning techniques.

General Reinforcement Learning
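Value functions of the kind described here are classically estimated by temporal difference learning. A minimal tabular TD(0) sketch for policy evaluation on a hypothetical Markov chain (chain, rewards, and step sizes are illustrative assumptions, not from the paper):

```python
import numpy as np

# Hypothetical 3-state Markov chain with per-state rewards, for illustration.
P = np.array([[0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.8

# Exact discounted value function: V = (I - gamma * P)^{-1} r
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)

# TD(0): stochastic approximation of the same linear equation.
rng = np.random.default_rng(1)
V = np.zeros(3)
N = np.zeros(3)                      # per-state visit counts
s = 0
for _ in range(200_000):
    s2 = rng.choice(3, p=P[s])
    N[s] += 1
    step = 1.0 / N[s] ** 0.6         # polynomial step size (an assumption)
    V[s] += step * (r[s] + gamma * V[s2] - V[s])
    s = s2
```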

Fastest Convergence for Q-learning

no code implementations · 12 Jul 2017 · Adithya M. Devraj, Sean P. Meyn

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original algorithm and recent competitors in several respects.

Q-Learning, reinforcement-learning, +1
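One way Zap Q-learning improves on Watkins' algorithm is by replacing the scalar step size with a matrix gain, in the spirit of stochastic Newton-Raphson: a second, faster recursion tracks the mean "Jacobian" of the TD error and its inverse conditions the parameter update. A rough sketch of that idea on a made-up two-state MDP (the MDP, initialization, and step-size schedules are illustrative assumptions, not the paper's experiments):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP, invented for illustration.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.3, 0.7], [0.8, 0.2]]])
R = np.array([[2.0, 0.0], [0.0, 2.0]])
gamma = 0.9
d = 4                                  # one parameter per (state, action) pair

def idx(s, a):                         # flatten (s, a) -> coordinate of theta
    return 2 * s + a

# Exact fixed point, for reference.
Qstar = np.zeros((2, 2))
for _ in range(2000):
    Qstar = R + gamma * P @ Qstar.max(axis=1)

# Zap-style two-time-scale update: A_hat estimates E[zeta (gamma*psi - zeta)^T]
# on a faster time scale, and -A_hat^{-1} serves as the matrix gain.
rng = np.random.default_rng(2)
theta = np.zeros(d)
A_hat = -np.eye(d)                     # invertible initialization (an assumption)
s = 0
for n in range(1, 50_001):
    a = rng.integers(2)                # uniform exploration
    s2 = rng.choice(2, p=P[s, a])
    zeta = np.zeros(d); zeta[idx(s, a)] = 1.0
    a2 = int(np.argmax(theta.reshape(2, 2)[s2]))   # greedy action at s2
    psi = np.zeros(d); psi[idx(s2, a2)] = 1.0
    td = R[s, a] + gamma * theta[idx(s2, a2)] - theta[idx(s, a)]
    alpha = 1.0 / (n + 10)             # parameter step size
    beta = 1.0 / (n + 10) ** 0.85      # faster step size for the matrix estimate
    A_hat += beta * (np.outer(zeta, gamma * psi - zeta) - A_hat)
    theta -= alpha * np.linalg.solve(A_hat, zeta * td)
    s = s2

Q_zap = theta.reshape(2, 2)
```

The matrix gain makes the linearized mean dynamics behave like a Newton step, which is the mechanism behind the fast convergence claimed in the abstract.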

Differential TD Learning for Value Function Approximation

no code implementations · 6 Apr 2016 · Adithya M. Devraj, Sean P. Meyn

The algorithm introduced in this paper is intended to resolve two well-known problems with this approach: in the discounted-cost setting, the variance of the algorithm diverges as the discount factor approaches unity.

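The divergence mentioned in the abstract can be seen directly in the closed-form discounted value function, and it motivates working with a centered ("relative") value function instead. A small deterministic illustration on a hypothetical chain (numbers invented for illustration):

```python
import numpy as np

# Hypothetical 3-state chain, invented for illustration.
P = np.array([[0.5, 0.4, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
r = np.array([1.0, 3.0, 0.0])

# Stationary distribution pi (left eigenvector for eigenvalue 1)
# and average reward eta = pi @ r.
w, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()
eta = pi @ r

# The discounted value V_g = (I - g*P)^{-1} r blows up like eta / (1 - g),
# while the centered value h_g = V_g - eta/(1-g) stays bounded as g -> 1.
gammas = (0.9, 0.99, 0.999)
V = {g: np.linalg.solve(np.eye(3) - g * P, r) for g in gammas}
h = {g: V[g] - eta / (1 - g) for g in gammas}
for g in gammas:
    print(g, round(V[g].max(), 2), np.round(h[g], 3))
```

The bounded centered function is (up to normalization) the object that relative and differential value-function methods estimate, which is why they avoid the variance blow-up near unit discounting.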
