Search Results for author: Enlu Zhou

Found 12 papers, 2 papers with code

A Diffusion Approximation Theory of Momentum SGD in Nonconvex Optimization

no code implementations 14 Feb 2018 Tianyi Liu, Zhehui Chen, Enlu Zhou, Tuo Zhao

Our theoretical discovery partially corroborates the empirical success of MSGD in training deep neural networks.

Bayesian Inference, Dimensionality Reduction, +1

Towards Understanding Acceleration Tradeoff between Momentum and Asynchrony in Nonconvex Stochastic Optimization

no code implementations NeurIPS 2018 Tianyi Liu, Shiyang Li, Jianping Shi, Enlu Zhou, Tuo Zhao

Asynchronous momentum stochastic gradient descent (Async-MSGD) is one of the most popular algorithms in distributed machine learning.

Stochastic Optimization

Towards Understanding the Importance of Noise in Training Neural Networks

no code implementations 7 Sep 2019 Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, Tuo Zhao

Ample empirical evidence has corroborated that noise plays a crucial role in the effective and efficient training of neural networks.

Towards Understanding the Importance of Shortcut Connections in Residual Networks

no code implementations NeurIPS 2019 Tianyi Liu, Minshuo Chen, Mo Zhou, Simon S. Du, Enlu Zhou, Tuo Zhao

We show, however, that gradient descent combined with proper normalization avoids being trapped by the spurious local optimum and converges to a global optimum in polynomial time when the weight of the first layer is initialized at 0 and that of the second layer is initialized arbitrarily in a ball.

Bayesian Optimization of Risk Measures

2 code implementations NeurIPS 2020 Sait Cakmak, Raul Astudillo, Peter Frazier, Enlu Zhou

We consider Bayesian optimization of objective functions of the form $\rho[ F(x, W) ]$, where $F$ is a black-box expensive-to-evaluate function and $\rho$ denotes either the VaR or CVaR risk measure, computed with respect to the randomness induced by the environmental random variable $W$.

Bayesian Optimization, Decision Making, +2
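
To make the objective $\rho[ F(x, W) ]$ in the entry above concrete, the following minimal sketch estimates VaR and CVaR from Monte Carlo samples of $W$ at a fixed $x$. It is an illustration only and not the paper's method: `toy_objective`, `empirical_var_cvar`, and `risk_at` are hypothetical names, the quadratic objective and the standard normal $W$ are assumptions, and larger values of $F$ are treated as worse.

```python
import math
import random


def toy_objective(x, w):
    """Hypothetical stand-in for the black-box F(x, W); not from the paper."""
    return (x - 1.0) ** 2 + w * x


def empirical_var_cvar(samples, alpha):
    """Empirical VaR and CVaR at level alpha, treating larger values as worse."""
    ordered = sorted(samples)
    # VaR: smallest sample t whose empirical P(Z <= t) is at least alpha.
    index = max(math.ceil(alpha * len(ordered)) - 1, 0)
    var = ordered[index]
    # CVaR (empirical approximation): mean of the sorted samples from VaR upward.
    tail = ordered[index:]
    cvar = sum(tail) / len(tail)
    return var, cvar


def risk_at(x, alpha=0.9, n_samples=10_000, seed=0):
    """Estimate (VaR, CVaR) of F(x, W) with W ~ N(0, 1) (an assumed distribution)."""
    rng = random.Random(seed)
    samples = [toy_objective(x, rng.gauss(0.0, 1.0)) for _ in range(n_samples)]
    return empirical_var_cvar(samples, alpha)
```

Calling `risk_at(x)` over a grid of `x` values gives a rough picture of the risk-adjusted objective that a Bayesian optimization method would model with far fewer evaluations of $F$.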

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization

no code implementations 24 Feb 2021 Tianyi Liu, Yan Li, Song Wei, Enlu Zhou, Tuo Zhao

Ample empirical evidence has corroborated the importance of noise in nonconvex optimization problems.

Bayesian Risk Markov Decision Processes

no code implementations 4 Jun 2021 Yifan Lin, Yuxuan Ren, Enlu Zhou

We consider finite-horizon Markov Decision Processes where parameters, such as transition probabilities, are unknown and estimated from data.

Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably

no code implementations 7 Feb 2022 Tianyi Liu, Yan Li, Enlu Zhou, Tuo Zhao

We investigate the role of noise in optimization algorithms for learning over-parameterized models.

Robust Multi-Objective Bayesian Optimization Under Input Noise

1 code implementation 15 Feb 2022 Samuel Daulton, Sait Cakmak, Maximilian Balandat, Michael A. Osborne, Enlu Zhou, Eytan Bakshy

In many manufacturing processes, the design parameters are subject to random input noise, resulting in a product that is often less performant than expected.

Bayesian Optimization

Risk-averse Contextual Multi-armed Bandit Problem with Linear Payoffs

no code implementations 24 Jun 2022 Yifan Lin, Yuhao Wang, Enlu Zhou

In particular, we consider mean-variance as the risk criterion, and the best arm is the one with the largest mean-variance reward.

Thompson Sampling

Approximate Bilevel Difference Convex Programming for Bayesian Risk Markov Decision Processes

no code implementations 26 Jan 2023 Yifan Lin, Enlu Zhou

We consider infinite-horizon Markov Decision Processes where parameters, such as transition probabilities, are unknown and estimated from data.

Reusing Historical Trajectories in Natural Policy Gradient via Importance Sampling: Convergence and Convergence Rate

no code implementations 1 Mar 2024 Yifan Lin, Yuhao Wang, Enlu Zhou

The efficient utilization of historical trajectories obtained from previous policies is essential for expediting policy optimization.

Policy Gradient Methods
