Search Results for author: Wenlong Mou

Found 19 papers, 0 papers with code

On Bellman equations for continuous-time policy evaluation I: discretization and approximation

no code implementations8 Jul 2024 Wenlong Mou, Yuhua Zhu

We study the problem of computing the value function from a discretely-observed trajectory of a continuous-time diffusion process.

Reinforcement Learning (RL)

Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency

no code implementations16 Jan 2023 Wenlong Mou, Peng Ding, Martin J. Wainwright, Peter L. Bartlett

When it is violated, the classical semi-parametric efficiency bound can easily become infinite, so that the instance-optimal risk depends on the function class used to model the regression function.

regression

Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency

no code implementations26 Sep 2022 Wenlong Mou, Martin J. Wainwright, Peter L. Bartlett

The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures.

Causal Inference

Optimal variance-reduced stochastic approximation in Banach spaces

no code implementations21 Jan 2022 Wenlong Mou, Koulik Khamaru, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space.

Q-Learning

Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

no code implementations23 Dec 2021 Wenlong Mou, Ashwin Pananjady, Martin J. Wainwright, Peter L. Bartlett

We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher order terms.

Model Selection

Optimal oracle inequalities for solving projected fixed-point equations

no code implementations9 Dec 2020 Wenlong Mou, Ashwin Pananjady, Martin J. Wainwright

Linear fixed point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations.

ROOT-SGD: Sharp Nonasymptotics and Near-Optimal Asymptotics in a Single Algorithm

no code implementations28 Aug 2020 Chris Junchi Li, Wenlong Mou, Martin J. Wainwright, Michael I. Jordan

We study the problem of solving strongly convex and smooth unconstrained optimization problems using stochastic first-order algorithms.

Stochastic Optimization Unity

On the Sample Complexity of Reinforcement Learning with Policy Space Generalization

no code implementations17 Aug 2020 Wenlong Mou, Zheng Wen, Xi Chen

To avoid such undesirable dependence on the state and action space sizes, this paper proposes a new notion of eluder dimension for the policy space, which characterizes the intrinsic complexity of policy learning in an arbitrary Markov Decision Process (MDP).

reinforcement-learning Reinforcement Learning (RL)

On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

no code implementations9 Apr 2020 Wenlong Mou, Chris Junchi Li, Martin J. Wainwright, Peter L. Bartlett, Michael. I. Jordan

When the matrix $\bar{A}$ is Hurwitz, we prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity.

Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing

no code implementations11 Dec 2019 Wenlong Mou, Nhat Ho, Martin J. Wainwright, Peter L. Bartlett, Michael. I. Jordan

We study the problem of sampling from the power posterior distribution in Bayesian Gaussian mixture models, a robust version of the classical posterior.

An Efficient Sampling Algorithm for Non-smooth Composite Potentials

no code implementations1 Oct 2019 Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, Peter L. Bartlett

We consider the problem of sampling from a density of the form $p(x) \propto \exp(-f(x)- g(x))$, where $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a smooth and strongly convex function and $g: \mathbb{R}^d \rightarrow \mathbb{R}$ is a convex and Lipschitz function.

High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

no code implementations28 Aug 2019 Wenlong Mou, Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett, Michael. I. Jordan

We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities.

Vocal Bursts Intensity Prediction

Differentially Private Clustering in High-Dimensional Euclidean Spaces

no code implementations ICML 2017 Maria-Florina Balcan, Travis Dick, YIngyu Liang, Wenlong Mou, Hongyang Zhang

We study the problem of clustering sensitive data while preserving the privacy of individuals represented in the dataset, which has broad applications in practical machine learning and data analysis tasks.

Clustering Vocal Bursts Intensity Prediction

Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints

no code implementations19 Jul 2017 Wenlong Mou, Li-Wei Wang, Xiyu Zhai, Kai Zheng

This is the first algorithm-dependent result with reasonable dependence on aggregated step sizes for non-convex learning, and has important implications to statistical learning aspects of stochastic gradient methods in complicated models such as deep learning.

Generalization Bounds Learning Theory +1

Collect at Once, Use Effectively: Making Non-interactive Locally Private Learning Possible

no code implementations ICML 2017 Kai Zheng, Wenlong Mou, Li-Wei Wang

For learning with smooth generalized linear losses, we propose an approximate stochastic gradient oracle estimated from non-interactive LDP channel, using Chebyshev expansion.

regression

Efficient Private ERM for Smooth Objectives

no code implementations29 Mar 2017 Jiaqi Zhang, Kai Zheng, Wenlong Mou, Li-Wei Wang

For strongly convex and smooth objectives, we prove that gradient descent with output perturbation not only achieves nearly optimal utility, but also significantly improves the running time of previous state-of-the-art private optimization algorithms, for both $\epsilon$-DP and $(\epsilon, \delta)$-DP.

Stable Memory Allocation in the Hippocampus: Fundamental Limits and Neural Realization

no code implementations14 Dec 2016 Wenlong Mou, Zhi Wang, Li-Wei Wang

In Valiant's neuroidal model, the hippocampus was described as a randomly connected graph, the computation on which maps input to a set of activated neuroids with stable size.

Hippocampus Memorization

Cannot find the paper you are looking for? You can Submit a new open access paper.