Search Results for author: Cong Fang

Found 22 papers, 5 papers with code

Policy Representation via Diffusion Probability Model for Reinforcement Learning

1 code implementation · 22 May 2023 · Long Yang, Zhixiong Huang, Fenghao Lei, Yucun Zhong, Yiming Yang, Cong Fang, Shiting Wen, Binbin Zhou, Zhouchen Lin

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policies and degrades the ability to explore.
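For intuition about how a diffusion model lifts the unimodality restriction, the sketch below shows DDPM-style ancestral sampling of an action conditioned on a state. It is a minimal, hedged illustration: the noise schedule, the placeholder `eps_model`, and every constant are hypothetical stand-ins, not the architecture or hyperparameters used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule (illustrative; not the paper's choice).
T = 50
betas = np.linspace(1e-4, 0.1, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(a_t, state, t):
    """Placeholder noise predictor conditioned on the state; a real
    diffusion policy would use a trained network here."""
    return 0.1 * a_t - 0.05 * state

def sample_action(state, action_dim=1):
    """DDPM-style ancestral sampling: start from Gaussian noise and
    iteratively denoise to produce an action."""
    a = rng.normal(size=action_dim)
    for t in reversed(range(T)):
        eps = eps_model(a, state, t)
        a = (a - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            a = a + np.sqrt(betas[t]) * rng.normal(size=action_dim)
    return a

action = sample_action(state=np.array([0.5]))
```

Because each action comes from an iterative stochastic denoising chain rather than a single Gaussian draw, the induced action distribution can be multimodal, which is exactly the expressiveness gap the excerpt points to.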

Continuous Control · Reinforcement Learning +1

Mathematical Models of Overparameterized Neural Networks

1 code implementation · 27 Dec 2020 · Cong Fang, Hanze Dong, Tong Zhang

Deep learning has achieved considerable empirical success in recent years.

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training

1 code implementation · 29 Jan 2021 · Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
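The Layer-Peeled Model referenced here optimizes the last-layer features and the classifier directly under the training loss with norm control. The toy sketch below, offered as a hedged illustration only, uses quadratic penalties in place of the paper's norm constraints; the class sizes, learning rate, and all other constants are illustrative choices, and the final line simply reports the angle between two minority-class classifier vectors, the quantity Minority Collapse concerns.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def layer_peeled_demo(counts=(100, 100, 5, 5), d=8, lam=5e-3, eta=0.5, iters=3000):
    """Hedged toy of a Layer-Peeled-style objective: last-layer features H and
    classifier W are optimized directly under cross-entropy, with quadratic
    penalties standing in for the paper's norm constraints."""
    K = len(counts)
    y = np.repeat(np.arange(K), counts)          # imbalanced class labels
    N = len(y)
    Y = np.eye(K)[y]                             # one-hot targets
    W = 0.1 * rng.normal(size=(K, d))
    H = 0.1 * rng.normal(size=(N, d))
    for _ in range(iters):
        P = softmax(H @ W.T)                     # logits are W h_i
        G = (P - Y) / N                          # gradient of CE w.r.t. logits
        W -= eta * (G.T @ H + lam * W)
        H -= eta * (G @ W + lam * H)
    # Angle between the two minority-class classifier vectors; under heavy
    # imbalance the paper's Minority Collapse analysis predicts such
    # directions draw together.
    w2, w3 = W[2] / np.linalg.norm(W[2]), W[3] / np.linalg.norm(W[3])
    return float(np.degrees(np.arccos(np.clip(w2 @ w3, -1.0, 1.0))))

print(layer_peeled_demo())
```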

Improved Analysis of Clipping Algorithms for Non-convex Optimization

1 code implementation · NeurIPS 2020 · Bohang Zhang, Jikai Jin, Cong Fang, LiWei Wang

Gradient clipping is commonly used in training deep neural networks, partly due to its practical effectiveness in alleviating the exploding gradient problem.
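Since the entry concerns gradient clipping itself, a minimal sketch of global norm clipping inside plain gradient descent may help fix ideas. The function name, step size, threshold, and the toy objective are illustrative assumptions, not the setting analyzed in the paper.

```python
import numpy as np

def clipped_gradient_descent(grad_fn, x0, step=0.1, max_norm=1.0, iters=200):
    """Gradient descent with global norm clipping: a minimal sketch of the
    clipping operation, with illustrative constants."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        g = grad_fn(x)
        g_norm = np.linalg.norm(g)
        if g_norm > max_norm:            # rescale overly large gradients
            g = g * (max_norm / g_norm)
        x = x - step * g
    return x

# Toy usage: far from the origin the raw gradient 4x^3 explodes; clipping
# keeps every step bounded so the iterates do not blow up.
x_hat = clipped_gradient_descent(lambda x: 4.0 * x**3, x0=np.array([3.0]))
```

On the toy quartic, the unclipped gradient at x = 3 is 108, so a raw step of size 0.1 would overshoot badly; with clipping the iterates move steadily toward the minimizer.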

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

no code implementations · 27 Feb 2018 · Cong Fang, Yameng Huang, Zhouchen Lin

The accelerated asynchronous algorithms achieve a faster $O(1/\sqrt{\epsilon})$ (v.s. $O(1/\epsilon)$) convergence rate for non-strongly convex functions, and $O(\sqrt{\kappa}\log(1/\epsilon))$ (v.s. $O(\kappa\log(1/\epsilon))$) for strongly convex functions.
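The quoted rates are the familiar accelerated-versus-plain gradient rates. For reference only, here is a hedged sketch contrasting textbook (synchronous) Nesterov acceleration with plain gradient descent on an ill-conditioned quadratic; it does not implement the paper's asynchronous momentum-compensation scheme, and the problem, step size, and iteration counts are illustrative.

```python
import numpy as np

def gd(grad_fn, x0, eta, iters):
    # Plain gradient descent for reference.
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - eta * grad_fn(x)
    return x

def nesterov_agd(grad_fn, x0, eta, iters):
    """Textbook Nesterov acceleration (non-strongly-convex variant), shown only
    to illustrate the O(1/sqrt(eps)) vs O(1/eps) gap the entry refers to."""
    x = y = np.asarray(x0, dtype=float)
    t = 1.0
    for _ in range(iters):
        x_next = y - eta * grad_fn(y)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_next + (t - 1.0) / t_next * (x_next - x)
        x, t = x_next, t_next
    return x

# Ill-conditioned quadratic f(x) = 0.5 * x^T Q x: acceleration reaches a given
# accuracy in far fewer iterations than plain gradient descent.
rng = np.random.default_rng(0)
Q = np.diag(np.linspace(0.01, 1.0, 20))
quad_grad = lambda x: Q @ x
x0 = rng.normal(size=20)
print(np.linalg.norm(gd(quad_grad, x0, eta=1.0, iters=300)),
      np.linalg.norm(nesterov_agd(quad_grad, x0, eta=1.0, iters=300)))
```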

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator

no code implementations · NeurIPS 2018 · Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

For stochastic first-order methods, we combine SPIDER with normalized gradient descent to propose two new algorithms, SPIDER-SFO and SPIDER-SFO+, which solve non-convex stochastic optimization problems using stochastic gradients only.
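To make the path-integrated estimator and the normalized step concrete, here is a hedged single-sample sketch on a toy least-squares finite sum. The refresh period `q`, the step size, the mini-batch size of one, and the toy problem are illustrative choices rather than the parameters analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem: f(x) = (1/n) * sum_i (a_i . x - b_i)^2 (illustrative only).
n, d = 50, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    # Stochastic gradient of a single component function.
    return 2.0 * (A[i] @ x - b[i]) * A[i]

def grad_full(x):
    # Full gradient, used to restart the estimator every q steps.
    return 2.0 * A.T @ (A @ x - b) / n

def spider_sfo(x0, eta=0.05, q=10, iters=200):
    """Sketch of SPIDER-SFO: a path-integrated (variance-reduced) gradient
    estimator combined with a normalized gradient step."""
    x = np.asarray(x0, dtype=float)
    v = grad_full(x)
    for k in range(1, iters + 1):
        x_next = x - eta * v / (np.linalg.norm(v) + 1e-12)   # normalized step
        if k % q == 0:
            v = grad_full(x_next)                            # periodic refresh
        else:
            i = rng.integers(n)
            v = v + grad_i(x_next, i) - grad_i(x, i)         # recursive SPIDER update
        x = x_next
    return x

x_hat = spider_sfo(np.zeros(d))
```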

Stochastic Optimization

Lifted Proximal Operator Machines

no code implementations · 5 Nov 2018 · Jia Li, Cong Fang, Zhouchen Lin

LPOM is block multi-convex in all layer-wise weights and activations.

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator

no code implementations · NeurIPS 2018 · Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

Specially, we prove that the SPIDER-SFO algorithm achieves a gradient computation cost of $\mathcal{O}\left( \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3} ) \right)$ to find an $\epsilon$-approximate first-order stationary point.

Stochastic Optimization

Sharp Analysis for Nonconvex SGD Escaping from Saddle Points

no code implementations · 1 Feb 2019 · Cong Fang, Zhouchen Lin, Tong Zhang

In this paper, we give a sharp analysis for Stochastic Gradient Descent (SGD) and prove that SGD is able to efficiently escape from saddle points and find an $(\epsilon, O(\epsilon^{0.5}))$-approximate second-order stationary point in $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient computations for generic nonconvex optimization problems, when the objective function satisfies gradient-Lipschitz, Hessian-Lipschitz, and dispersive noise assumptions.
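As a toy picture of the role the dispersive-noise assumption plays, the sketch below runs plain SGD with isotropic gradient noise on a two-dimensional strict saddle; the objective, noise level, and step size are illustrative and this is not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

def grad(z):
    # Toy strict saddle: f(x, y) = 0.5 * (x**2 - y**2) with a saddle at the origin.
    return np.array([z[0], -z[1]])

def noisy_sgd(z0, eta=0.01, sigma=0.1, iters=1000):
    """Plain SGD with isotropic (dispersive) gradient noise; a toy picture of
    noise pushing the iterates off a strict saddle."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z = z - eta * (grad(z) + sigma * rng.normal(size=2))
    return z

# Starting exactly at the saddle, the noise seeds the escaping
# (negative-curvature) direction and the iterates drift away along the y axis.
print(noisy_sgd(np.array([0.0, 0.0])))
```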

Stochastic Optimization

A Stochastic Trust Region Method for Non-convex Minimization

no code implementations · ICLR 2020 · Zebang Shen, Pan Zhou, Cong Fang, Alejandro Ribeiro

We target the problem of finding a local minimum in non-convex finite-sum minimization.

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations

no code implementations · 25 Oct 2019 · Cong Fang, Hanze Dong, Tong Zhang

Recently, over-parameterized neural networks have been extensively analyzed in the literature.

Convex Formulation of Overparameterized Deep Neural Networks

no code implementations · 18 Nov 2019 · Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang

This new analysis is consistent with empirical observations that deep neural networks are capable of learning efficient feature representations.

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

no code implementations · 3 Jul 2020 · Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang

This new representation overcomes the degenerate situation in which, within each middle layer, all the hidden units essentially collapse into a single meaningful hidden unit; it further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization.

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks

1 code implementation · NeurIPS 2020 · Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, Tong Zhang

With the help of a new technique called neural network grafting, we demonstrate that even during the entire training process, feature distributions of differently initialized networks remain similar at each layer.

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium

no code implementations · 2 Mar 2023 · Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang

To circumvent this difficulty, we examine the problem of identifying a mixed Nash equilibrium, where strategies are randomized and characterized by probability distributions over continuous domains. To this end, we propose PArticle-based Primal-dual ALgorithm (PAPAL) tailored for a weakly entropy-regularized min-max optimization over probability distributions.

Environment Invariant Linear Least Squares

no code implementations · 6 Mar 2023 · Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.

Causal Inference · Regression +2

CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity

no code implementations · 23 Sep 2023 · Pengyun Yue, Hanzhen Zhao, Cong Fang, Di He, LiWei Wang, Zhouchen Lin, Song-Chun Zhu

With distributed machine learning being a prominent technique for large-scale machine learning tasks, communication complexity has become a major bottleneck for speeding up training and for scaling up the number of machines.

Distributed Optimization

Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

no code implementations · 6 Dec 2023 · Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning, including linear regression and beyond.

The Implicit Bias of Heterogeneity towards Invariance and Causality

no code implementations · 3 Mar 2024 · Yang Xu, Yihong Gu, Cong Fang

It has been observed empirically that large language models (LLMs), trained with a variant of regression loss on massive corpora from the Internet, can unveil causal associations to some extent.

Causal Inference · Regression

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

no code implementations · 19 Mar 2024 · Lirui Luo, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang, Qing Li

In this paper, we present a framework that learns structured states and symbolic policies simultaneously; its key idea is to overcome the efficiency bottleneck by distilling vision foundation models into a scalable perception module.

Decision Making
