Search Results for author: Cong Fang

Found 18 papers, 5 papers with code

Policy Representation via Diffusion Probability Model for Reinforcement Learning

1 code implementation • 22 May 2023 Long Yang, Zhixiong Huang, Fenghao Lei, Yucun Zhong, Yiming Yang, Cong Fang, Shiting Wen, Binbin Zhou, Zhouchen Lin

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policies and degrades the ability to explore.

Continuous Control reinforcement-learning +1
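
The paper's motivation is that a diffusion model can represent multimodal action distributions that a single Gaussian policy cannot. A toy sketch of Langevin-style denoising toward a two-mode action distribution; the analytic `score` and every name below are illustrative, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(2)
mus, sig = np.array([-2.0, 2.0]), 0.3   # two action modes a unimodal policy would blur

def score(a):
    # ∇ log p(a) for an equal-weight two-Gaussian mixture (analytic, toy).
    logw = -0.5 * ((a - mus) / sig) ** 2
    w = np.exp(logw - logw.max())        # numerically stable softmax weights
    w = w / w.sum()
    return float(np.sum(w * (mus - a)) / sig ** 2)

# Langevin-style reverse diffusion: start from noise, iteratively denoise.
samples = []
for _ in range(100):
    a = rng.normal() * 3.0
    for _ in range(50):
        a = a + 0.01 * score(a) + np.sqrt(0.02) * rng.normal()
    samples.append(a)
samples = np.array(samples)
```

Unlike a unimodal Gaussian policy, the samples land near both modes rather than collapsing onto one.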

Environment Invariant Linear Least Squares

no code implementations • 6 Mar 2023 Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

The joint distribution of the response variable and covariate may vary across different environments, yet the conditional expectation of $y$ given the unknown set of important variables is invariant across environments.

Causal Inference Transfer Learning +1
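
The invariance assumption can be illustrated with a toy two-environment regression in which the covariate distribution shifts but $E[y \mid x]$ does not. All names and data below are synthetic illustrations, not the paper's EILLS estimator:

```python
import numpy as np

rng = np.random.default_rng(3)
beta = np.array([1.0, -2.0])            # invariant coefficients on the important variables

def make_env(n, x_scale):
    # Covariate distribution differs across environments; E[y | x] = x @ beta does not.
    X = rng.normal(scale=x_scale, size=(n, 2))
    y = X @ beta + 0.1 * rng.normal(size=n)
    return X, y

X1, y1 = make_env(500, 1.0)             # environment 1: unit-scale covariates
X2, y2 = make_env(500, 3.0)             # environment 2: shifted covariate scale

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

b1, b2 = ols(X1, y1), ols(X2, y2)       # per-environment estimates agree under invariance
```

Because the conditional mean is invariant, the per-environment least-squares fits recover (nearly) the same coefficients despite the distribution shift.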

Provable Particle-based Primal-Dual Algorithm for Mixed Nash Equilibrium

no code implementations • 2 Mar 2023 Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang

We consider the general nonconvex nonconcave minimax problem over continuous variables.
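
The paper studies particle-based primal-dual methods for mixed Nash equilibria over continuous variables. As a much simpler warm-up (not the paper's algorithm), simultaneous gradient descent-ascent on a strongly-convex-strongly-concave toy objective converges to the saddle point:

```python
# Toy minimax objective f(x, y) = 0.5*x**2 + x*y - 0.5*y**2, saddle at (0, 0).
def fx(x, y):
    return x + y      # ∂f/∂x

def fy(x, y):
    return x - y      # ∂f/∂y

x, y, eta = 1.0, 1.0, 0.1
for _ in range(500):
    gx, gy = fx(x, y), fy(x, y)
    x, y = x - eta * gx, y + eta * gy   # descent on x, ascent on y, simultaneously
```

For nonconvex-nonconcave problems this naive scheme can cycle or diverge, which is what motivates provable particle-based methods over distributions instead of single points.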

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training

1 code implementation • 29 Jan 2021 Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term \textit{Minority Collapse}, which fundamentally limits the performance of deep learning models on the minority classes.
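
In the balanced case, the Layer-Peeled Model recovers neural collapse: the $K$ class-mean directions form a simplex equiangular tight frame with pairwise cosine $-1/(K-1)$; Minority Collapse is the breakdown of this geometry under imbalance. A quick numerical check of the balanced ETF geometry (illustrative only):

```python
import numpy as np

K = 4
# Simplex ETF: K equiangular unit vectors with pairwise cosine -1/(K-1).
M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)
M = M / np.linalg.norm(M, axis=0)       # normalize columns (already unit norm here)
G = M.T @ M                              # Gram matrix of class-mean directions
```

The diagonal of `G` is 1 and every off-diagonal entry equals $-1/(K-1) = -1/3$, i.e., the class means are maximally and equally separated.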

Mathematical Models of Overparameterized Neural Networks

1 code implementation • 27 Dec 2020 Cong Fang, Hanze Dong, Tong Zhang

Deep learning has received considerable empirical successes in recent years.

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks

1 code implementation NeurIPS 2020 Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, Tong Zhang

With the help of a new technique called {\it neural network grafting}, we demonstrate that, throughout the entire training process, feature distributions of differently initialized networks remain similar at each layer.

Improved Analysis of Clipping Algorithms for Non-convex Optimization

1 code implementation NeurIPS 2020 Bohang Zhang, Jikai Jin, Cong Fang, LiWei Wang

Gradient clipping is commonly used in training deep neural networks, partly due to its effectiveness in relieving the exploding gradient problem.
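
A minimal sketch of the global-norm clipping step such analyses concern; `clip_gradient` is an illustrative name, not the paper's code:

```python
import numpy as np

def clip_gradient(grad, max_norm):
    """Scale grad so its Euclidean norm is at most max_norm (global-norm clipping)."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

# One clipped SGD step on a toy quadratic f(x) = 0.5*||x||^2, whose gradient is x.
x = np.array([30.0, 40.0])          # gradient norm is 50 here, far above the threshold
g = clip_gradient(x.copy(), max_norm=1.0)
x_next = x - 0.1 * g                # step size 0.1
```

Clipping rescales the step but preserves its direction, which is why it tames exploding gradients without redirecting the descent.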

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

no code implementations • 3 Jul 2020 Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang

This new representation overcomes the degenerate situation in which the hidden units of each middle layer collapse to essentially a single meaningful unit, and it further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization.

Convex Formulation of Overparameterized Deep Neural Networks

no code implementations • 18 Nov 2019 Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang

This new analysis is consistent with empirical observations that deep neural networks are capable of learning efficient feature representations.

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations

no code implementations • 25 Oct 2019 Cong Fang, Hanze Dong, Tong Zhang

Recently, over-parameterized neural networks have been extensively analyzed in the literature.

A Stochastic Trust Region Method for Non-convex Minimization

no code implementations ICLR 2020 Zebang Shen, Pan Zhou, Cong Fang, Alejandro Ribeiro

We target the problem of finding a local minimum in non-convex finite-sum minimization.
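
At each iterate, trust-region methods minimize a local quadratic model within a ball of radius $\Delta$; the classical Cauchy-point step solves this approximately along the negative gradient. The paper's stochastic sampling of gradients and Hessians is not shown here; all names are illustrative:

```python
import numpy as np

def cauchy_point(g, H, radius):
    """Minimize the model m(p) = g·p + 0.5*p·H·p along -g within ||p|| <= radius."""
    gnorm = np.linalg.norm(g)
    gHg = g @ H @ g
    if gHg <= 0:
        tau = 1.0                        # negative curvature: step to the boundary
    else:
        tau = min(1.0, gnorm ** 3 / (radius * gHg))
    return -tau * (radius / gnorm) * g

g = np.array([4.0, 0.0])                 # current gradient
H = np.eye(2)                            # current (approximate) Hessian
p = cauchy_point(g, H, radius=1.0)       # trial step, capped by the trust region
```

The step is accepted or rejected by comparing actual versus predicted decrease, which in the stochastic setting must itself be estimated from samples.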

Sharp Analysis for Nonconvex SGD Escaping from Saddle Points

no code implementations • 1 Feb 2019 Cong Fang, Zhouchen Lin, Tong Zhang

In this paper, we give a sharp analysis for Stochastic Gradient Descent (SGD) and prove that SGD is able to efficiently escape from saddle points and find an $(\epsilon, O(\epsilon^{0.5}))$-approximate second-order stationary point in $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient computations for generic nonconvex optimization problems, when the objective function satisfies gradient-Lipschitz, Hessian-Lipschitz, and dispersive noise assumptions.

Stochastic Optimization
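
The escape mechanism can be seen on a toy quadratic saddle: gradient descent contracts the positive-curvature direction, while dispersive noise is amplified along the negative-curvature direction until the iterate leaves the saddle region. This is an illustration, not the paper's proof construction:

```python
import numpy as np

rng = np.random.default_rng(1)
H = np.diag([1.0, -1.0])            # saddle at the origin: one negative eigenvalue

def grad(w):
    return H @ w

w = np.zeros(2)                     # start exactly at the saddle point
eta, sigma = 0.1, 0.01
for _ in range(200):
    noise = sigma * rng.normal(size=2)   # dispersive noise perturbs every step
    w = w - eta * (grad(w) + noise)
```

After a few hundred steps the coordinate along the negative eigendirection dominates, while the stable coordinate stays near zero, so SGD has escaped the saddle without any explicit second-order computation.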

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator

no code implementations NeurIPS 2018 Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

Specifically, we prove that the SPIDER-SFO algorithm achieves a gradient computation cost of $\mathcal{O}\left( \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3} ) \right)$ to find an $\epsilon$-approximate first-order stationary point.

Stochastic Optimization
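
The SPIDER estimator tracks the full gradient recursively: after a periodic full-gradient refresh, it corrects the previous estimate with a single-sample gradient difference, $v_t = \nabla f_{i}(x_t) - \nabla f_{i}(x_{t-1}) + v_{t-1}$. A toy least-squares sketch with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 3
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    # Gradient of the i-th component f_i(x) = 0.5*(a_i^T x - b_i)^2.
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    return A.T @ (A @ x - b) / n

x_prev = rng.normal(size=d)
v = full_grad(x_prev)               # periodic full-gradient refresh
x = x_prev - 0.1 * v
for t in range(5):                  # recursive SPIDER updates between refreshes
    i = rng.integers(n)
    v = grad_i(x, i) - grad_i(x_prev, i) + v
    x_prev, x = x, x - 0.1 * v
```

Because consecutive iterates are close, the single-sample difference has small variance, which is the source of the improved complexity over plain SGD.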

Lifted Proximal Operator Machines

no code implementations • 5 Nov 2018 Jia Li, Cong Fang, Zhouchen Lin

LPOM is block multi-convex in all layer-wise weights and activations.

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator

no code implementations NeurIPS 2018 Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

For stochastic first-order methods, combining SPIDER with normalized gradient descent, we propose two new algorithms, namely SPIDER-SFO and SPIDER-SFO\textsuperscript{+}, that solve non-convex stochastic optimization problems using stochastic gradients only.

Stochastic Optimization

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

no code implementations • 27 Feb 2018 Cong Fang, Yameng Huang, Zhouchen Lin

$O(1/\epsilon)$) convergence rate for non-strongly convex functions, and $O(\sqrt{\kappa}\log(1/\epsilon))$ (vs.
