Search Results for author: Cong Fang

Found 26 papers, 8 papers with code

Relational Learning in Pre-Trained Models: A Theory from Hypergraph Recovery Perspective

no code implementations17 Jun 2024 Yang Chen, Cong Fang, Zhouchen Lin, Bing Liu

Foundation Models (FMs) have demonstrated remarkable insights into the relational dynamics of the world, leading to the crucial question: how do these models acquire an understanding of the world's hybrid relations?

Entity Alignment · Relational Reasoning

Quantum Algorithms and Lower Bounds for Finite-Sum Optimization

no code implementations5 Jun 2024 Yexin Zhang, Chenyi Zhang, Cong Fang, LiWei Wang, Tongyang Li

In addition, when $F$ is nonconvex, our quantum algorithm can find an $\epsilon$-critical point using $\tilde{O}(n+\ell(d^{1/3}n^{1/3}+\sqrt{d})/\epsilon^2)$ queries.

On the Algorithmic Bias of Aligning Large Language Models with RLHF: Preference Collapse and Matching Regularization

1 code implementation26 May 2024 Jiancong Xiao, Ziniu Li, Xingyu Xie, Emily Getzen, Cong Fang, Qi Long, Weijie J. Su

To mitigate this algorithmic bias, we introduce preference matching (PM) RLHF, a novel approach that provably aligns LLMs with the preference distribution of the reward model under the Bradley--Terry--Luce/Plackett--Luce model.

Decision Making · Text Generation
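
As background for the Bradley--Terry--Luce model mentioned in the snippet above, here is a minimal sketch (not the paper's PM RLHF algorithm) of the preference probability it assigns to a pair of responses given scalar rewards; the reward values are illustrative:

```python
import math

def bt_preference_prob(r_a, r_b):
    """Bradley-Terry probability that response A is preferred over B,
    given scalar rewards r_a and r_b: sigmoid(r_a - r_b)."""
    return 1.0 / (1.0 + math.exp(-(r_a - r_b)))

# Illustrative rewards for two candidate responses
p = bt_preference_prob(2.0, 1.0)  # probability that A beats B
```

By construction the two orderings sum to one, so the model defines a proper preference distribution over each pair.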

Causality Pursuit from Heterogeneous Environments via Neural Adversarial Invariance Learning

1 code implementation7 May 2024 Yihong Gu, Cong Fang, Peter Bühlmann, Jianqing Fan

The challenge of finding such an unknown set of quasi-causal or invariant variables is compounded by the presence of endogenous variables that have heterogeneous effects across different environments; including even one of them in the regression would make the estimation inconsistent.

regression · Transfer Learning

End-to-End Neuro-Symbolic Reinforcement Learning with Textual Explanations

1 code implementation19 Mar 2024 Lirui Luo, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang, Qing Li

Neuro-symbolic reinforcement learning (NS-RL) has emerged as a promising paradigm for explainable decision-making, characterized by the interpretability of symbolic policies.

Decision Making · reinforcement-learning

The Implicit Bias of Heterogeneity towards Invariance and Causality

no code implementations3 Mar 2024 Yang Xu, Yihong Gu, Cong Fang

It has been observed empirically that large language models (LLMs), trained with a variant of regression loss on massive corpora from the Internet, can unveil causal associations to some extent.

Causal Inference · regression

Accelerated Gradient Algorithms with Adaptive Subspace Search for Instance-Faster Optimization

no code implementations6 Dec 2023 Yuanshi Liu, Hanzhen Zhao, Yang Xu, Pengyun Yue, Cong Fang

In this paper, we open up a new way to design and analyze gradient-based algorithms with direct applications in machine learning, including linear regression and beyond.

CORE: Common Random Reconstruction for Distributed Optimization with Provable Low Communication Complexity

no code implementations23 Sep 2023 Pengyun Yue, Hanzhen Zhao, Cong Fang, Di He, LiWei Wang, Zhouchen Lin, Song-Chun Zhu

With distributed machine learning being a prominent technique for large-scale machine learning tasks, communication complexity has become a major bottleneck for speeding up training and scaling up the number of machines.

Distributed Optimization

Policy Representation via Diffusion Probability Model for Reinforcement Learning

1 code implementation22 May 2023 Long Yang, Zhixiong Huang, Fenghao Lei, Yucun Zhong, Yiming Yang, Cong Fang, Shiting Wen, Binbin Zhou, Zhouchen Lin

Popular reinforcement learning (RL) algorithms tend to produce a unimodal policy distribution, which weakens the expressiveness of complicated policies and degrades exploration ability.

Continuous Control · reinforcement-learning · +1

Environment Invariant Linear Least Squares

no code implementations6 Mar 2023 Jianqing Fan, Cong Fang, Yihong Gu, Tong Zhang

To the best of our knowledge, this paper is the first to realize statistically efficient invariance learning in the general linear model.

Causal Inference · regression · +2

PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium

no code implementations2 Mar 2023 Shihong Ding, Hanze Dong, Cong Fang, Zhouchen Lin, Tong Zhang

To circumvent this difficulty, we examine the problem of identifying a mixed Nash equilibrium, where strategies are randomized and characterized by probability distributions over continuous domains. To this end, we propose PArticle-based Primal-dual ALgorithm (PAPAL) tailored for a weakly entropy-regularized min-max optimization over probability distributions.

Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training

1 code implementation29 Jan 2021 Cong Fang, Hangfeng He, Qi Long, Weijie J. Su

More importantly, when moving to the imbalanced case, our analysis of the Layer-Peeled Model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.

Mathematical Models of Overparameterized Neural Networks

1 code implementation27 Dec 2020 Cong Fang, Hanze Dong, Tong Zhang

Deep learning has achieved considerable empirical success in recent years.

How to Characterize The Landscape of Overparameterized Convolutional Neural Networks

1 code implementation NeurIPS 2020 Yihong Gu, Weizhong Zhang, Cong Fang, Jason D. Lee, Tong Zhang

With the help of a new technique called neural network grafting, we demonstrate that even during the entire training process, feature distributions of differently initialized networks remain similar at each layer.

Improved Analysis of Clipping Algorithms for Non-convex Optimization

1 code implementation NeurIPS 2020 Bohang Zhang, Jikai Jin, Cong Fang, LiWei Wang

Gradient clipping is commonly used in training deep neural networks, partly because of its effectiveness in mitigating the exploding gradient problem.
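
Global-norm gradient clipping, the operation this paper analyzes, can be sketched in a few lines of NumPy; the threshold value below is an illustrative choice, not one taken from the paper:

```python
import numpy as np

def clip_by_global_norm(grad, max_norm):
    """Rescale grad so its Euclidean norm is at most max_norm,
    leaving it unchanged when already within the threshold."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad

g = np.array([3.0, 4.0])                  # norm 5
g_clipped = clip_by_global_norm(g, 1.0)   # rescaled down to norm 1
```

The direction of the gradient is preserved; only its magnitude is capped, which is what keeps a single exploding gradient from destabilizing an update.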

Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

no code implementations3 Jul 2020 Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang

This new representation overcomes the degenerate situation in which all the hidden units of each middle layer essentially collapse into a single meaningful hidden unit, and further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization.

Convex Formulation of Overparameterized Deep Neural Networks

no code implementations18 Nov 2019 Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang

This new analysis is consistent with empirical observations that deep neural networks are capable of learning efficient feature representations.

Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations

no code implementations25 Oct 2019 Cong Fang, Hanze Dong, Tong Zhang

Recently, over-parameterized neural networks have been extensively analyzed in the literature.

A Stochastic Trust Region Method for Non-convex Minimization

no code implementations ICLR 2020 Zebang Shen, Pan Zhou, Cong Fang, Alejandro Ribeiro

We target the problem of finding a local minimum in non-convex finite-sum minimization.

Sharp Analysis for Nonconvex SGD Escaping from Saddle Points

no code implementations1 Feb 2019 Cong Fang, Zhouchen Lin, Tong Zhang

In this paper, we give a sharp analysis for Stochastic Gradient Descent (SGD) and prove that SGD is able to efficiently escape from saddle points and find an $(\epsilon, O(\epsilon^{0.5}))$-approximate second-order stationary point in $\tilde{O}(\epsilon^{-3.5})$ stochastic gradient computations for generic nonconvex optimization problems, when the objective function satisfies gradient-Lipschitz, Hessian-Lipschitz, and dispersive noise assumptions.

Stochastic Optimization
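
A toy illustration (not the paper's analysis) of why gradient noise helps SGD leave a strict saddle: on $f(x, y) = x^2 - y^2$, plain gradient descent started exactly at the saddle $(0, 0)$ stays put, while small injected noise lets the iterate slide down the $-y^2$ direction. The step size, noise scale, and iteration count are arbitrary choices:

```python
import random

def grad(x, y):
    # f(x, y) = x^2 - y^2 has a strict saddle point at the origin
    return 2.0 * x, -2.0 * y

random.seed(0)
x, y = 0.0, 0.0          # start exactly at the saddle point
eta, sigma = 0.1, 1e-3   # step size and noise scale (illustrative)
for _ in range(100):
    gx, gy = grad(x, y)
    x -= eta * (gx + sigma * random.gauss(0.0, 1.0))
    y -= eta * (gy + sigma * random.gauss(0.0, 1.0))
```

After the loop, the $x$ coordinate stays near zero (its dynamics are contracting) while $|y|$ grows geometrically once noise nudges it off the unstable direction; on a toy quadratic this growth is unbounded, so one would stop early in practice.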

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path-Integrated Differential Estimator

no code implementations NeurIPS 2018 Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

Specifically, we prove that the SPIDER-SFO algorithm achieves a gradient computation cost of $\mathcal{O}\left( \min( n^{1/2} \epsilon^{-2}, \epsilon^{-3} ) \right)$ to find an $\epsilon$-approximate first-order stationary point.

Stochastic Optimization

Lifted Proximal Operator Machines

no code implementations5 Nov 2018 Jia Li, Cong Fang, Zhouchen Lin

LPOM is block multi-convex in all layer-wise weights and activations.

SPIDER: Near-Optimal Non-Convex Optimization via Stochastic Path Integrated Differential Estimator

no code implementations NeurIPS 2018 Cong Fang, Chris Junchi Li, Zhouchen Lin, Tong Zhang

For stochastic first-order methods, combining SPIDER with normalized gradient descent, we propose two new algorithms, namely SPIDER-SFO and SPIDER-SFO\textsuperscript{+}, that solve non-convex stochastic optimization problems using stochastic gradients only.

Stochastic Optimization
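
The path-integrated differential estimator at the core of SPIDER maintains a recursive gradient estimate $v_k = \nabla f_S(x_k) - \nabla f_S(x_{k-1}) + v_{k-1}$, refreshed periodically with a full gradient. A minimal NumPy sketch on a least-squares finite sum (problem sizes, step size, batch size, and refresh period are all illustrative choices, and the full algorithm adds normalization and tuned parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_batch(x, idx):
    # Mini-batch gradient of f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
    Ai, bi = A[idx], b[idx]
    return Ai.T @ (Ai @ x - bi) / len(idx)

x_prev = np.zeros(d)
v = grad_batch(x_prev, np.arange(n))    # full gradient initializes the estimator
eta, q = 0.01, 10                       # step size and refresh period (illustrative)
x = x_prev - eta * v
for k in range(1, 100):
    if k % q == 0:
        v = grad_batch(x, np.arange(n))                        # periodic full-gradient refresh
    else:
        idx = rng.integers(0, n, size=8)
        v = grad_batch(x, idx) - grad_batch(x_prev, idx) + v   # path-integrated update
    x_prev, x = x, x - eta * v
```

Because each recursive correction differences gradients at two nearby iterates, its variance shrinks with the step size, which is what drives the improved $n^{1/2}\epsilon^{-2}$ cost.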

Accelerating Asynchronous Algorithms for Convex Optimization by Momentum Compensation

no code implementations27 Feb 2018 Cong Fang, Yameng Huang, Zhouchen Lin

$O(1/\epsilon)$) convergence rate for non-strongly convex functions, and $O(\sqrt{\kappa}\log(1/\epsilon))$ (v.s.
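
The $O(\sqrt{\kappa}\log(1/\epsilon))$ rate quoted in the snippet is the accelerated rate attained by Nesterov-style momentum on strongly convex problems. A minimal sketch (not the paper's asynchronous momentum-compensated algorithm) on a quadratic with condition number $\kappa = 100$; the problem and iteration count are illustrative:

```python
import numpy as np

# Strongly convex quadratic f(x) = 0.5 * x @ H @ x with minimizer at 0
H = np.diag([1.0, 10.0, 100.0])
L, mu = 100.0, 1.0                 # smoothness and strong convexity constants

x = y = np.ones(3)
beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))
for _ in range(200):
    x_next = y - (1.0 / L) * (H @ y)   # gradient step at the extrapolated point
    y = x_next + beta * (x_next - x)   # momentum extrapolation
    x = x_next
```

With this constant momentum, the error contracts roughly by $(1 - 1/\sqrt{\kappa})$ per iteration, versus $(1 - 1/\kappa)$ for plain gradient descent.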
