Search Results for author: Alexey Naumov

Found 17 papers, 4 papers with code

SCAFFLSA: Quantifying and Eliminating Heterogeneity Bias in Federated Linear Stochastic Approximation and Temporal Difference Learning

no code implementations • 6 Feb 2024 • Paul Mangold, Sergey Samsonov, Safwan Labbi, Ilya Levin, Reda Alami, Alexey Naumov, Eric Moulines

In this paper, we perform a non-asymptotic analysis of the federated linear stochastic approximation (FedLSA) algorithm.

Demonstration-Regularized RL

no code implementations • 26 Oct 2023 • Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Alexey Naumov, Pierre Perrault, Michal Valko, Pierre Menard

In particular, we study demonstration-regularized reinforcement learning, which leverages expert demonstrations through KL-regularization toward a policy learned by behavior cloning.

Reinforcement Learning (RL)

Finite-Sample Analysis of the Temporal Difference Learning

no code implementations • 22 Oct 2023 • Sergey Samsonov, Daniil Tiapkin, Alexey Naumov, Eric Moulines

In this paper we consider the problem of obtaining sharp bounds for the performance of temporal difference (TD) methods with linear functional approximation for policy evaluation in discounted Markov Decision Processes.

Generative Flow Networks as Entropy-Regularized RL

1 code implementation • 19 Oct 2023 • Daniil Tiapkin, Nikita Morozov, Alexey Naumov, Dmitry Vetrov

We demonstrate how the task of learning a generative flow network can be efficiently redefined as an entropy-regularized RL problem with a specific reward and regularizer structure.

First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities

no code implementations • NeurIPS 2023 • Aleksandr Beznosikov, Sergey Samsonov, Marina Sheshukova, Alexander Gasnikov, Alexey Naumov, Eric Moulines

We present a unified approach for the theoretical analysis of first-order gradient methods for stochastic optimization and variational inequalities.

Stochastic Optimization

Sharp Deviations Bounds for Dirichlet Weighted Sums with Application to analysis of Bayesian algorithms

no code implementations • 6 Apr 2023 • Denis Belomestny, Pierre Menard, Alexey Naumov, Daniil Tiapkin, Michal Valko

These bounds are based on a novel integral representation of the density of a weighted Dirichlet sum.

Multi-Armed Bandits, Thompson Sampling

Theoretical guarantees for neural control variates in MCMC

no code implementations • 3 Apr 2023 • Denis Belomestny, Artur Goldman, Alexey Naumov, Sergey Samsonov

In this paper, we propose a variance reduction approach for Markov chains based on additive control variates and the minimization of an appropriate estimate for the asymptotic variance.

Fast Rates for Maximum Entropy Exploration

1 code implementation • 14 Mar 2023 • Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Pierre Perrault, Yunhao Tang, Michal Valko, Pierre Menard

Finally, we apply the developed regularization techniques to reduce the sample complexity of visitation entropy maximization to $\widetilde{\mathcal{O}}(H^2SA/\varepsilon^2)$, yielding a statistical separation between maximum entropy exploration and reward-free exploration.

Reinforcement Learning (RL)

Rosenthal-type inequalities for linear statistics of Markov chains

no code implementations • 10 Mar 2023 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Marina Sheshukova

In this paper, we establish novel deviation bounds for additive functionals of geometrically ergodic Markov chains similar to Rosenthal and Bernstein inequalities for sums of independent random variables.

Optimistic Posterior Sampling for Reinforcement Learning with Few Samples and Tight Guarantees

1 code implementation • 28 Sep 2022 • Daniil Tiapkin, Denis Belomestny, Daniele Calandriello, Eric Moulines, Remi Munos, Alexey Naumov, Mark Rowland, Michal Valko, Pierre Menard

We consider reinforcement learning in an environment modeled by an episodic, finite, stage-dependent Markov decision process of horizon $H$ with $S$ states and $A$ actions.

Reinforcement Learning (RL)

Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation

no code implementations • 10 Jul 2022 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov

Our finite-time instance-dependent bounds for the averaged LSA iterates are sharp in the sense that the leading term we obtain coincides with the local asymptotic minimax limit.

From Dirichlet to Rubin: Optimistic Exploration in RL without Bonuses

no code implementations • 16 May 2022 • Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Menard

We propose the Bayes-UCBVI algorithm for reinforcement learning in tabular, stage-dependent, episodic Markov decision processes: a natural extension of the Bayes-UCB algorithm by Kaufmann et al. (2012) for multi-armed bandits.

Multi-Armed Bandits

Local-Global MCMC kernels: the best of both worlds

1 code implementation • 4 Nov 2021 • Sergey Samsonov, Evgeny Lagutin, Marylou Gabrié, Alain Durmus, Alexey Naumov, Eric Moulines

Recent works leveraging learning to enhance sampling have shown promising results, in particular by designing effective non-local moves and global proposals.

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

no code implementations • NeurIPS 2021 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai

This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$.

Rates of convergence for density estimation with generative adversarial networks

no code implementations • 30 Jan 2021 • Nikita Puchkin, Sergey Samsonov, Denis Belomestny, Eric Moulines, Alexey Naumov

In this work we undertake a thorough study of the non-asymptotic properties of the vanilla generative adversarial networks (GANs).

Density Estimation

On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

no code implementations • 30 Jan 2021 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Hoi-To Wai

This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain.

Reinforcement Learning (RL)

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

no code implementations • 4 Feb 2020 • Maxim Kaledin, Eric Moulines, Alexey Naumov, Vladislav Tadic, Hoi-To Wai

Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise; only the constants are affected by the mixing time of the Markov chain.

Reinforcement Learning (RL)
