no code implementations • 13 Apr 2024 • Yun Ma, Yihong Wu, Pengkun Yang
We consider the problem of approximating a general Gaussian location mixture by finite mixtures.
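The flavor of this approximation problem can be illustrated numerically: replace a continuous mixing measure by a few atoms and measure how far the resulting mixture density drifts. The sketch below is a hypothetical illustration only (uniform mixing measure, equally spaced atoms, unit variance), not the paper's construction or rates.

```python
import numpy as np

def gaussian_mixture_density(x, atoms, weights):
    """Density of the Gaussian location mixture sum_j w_j * N(atom_j, 1) at points x."""
    x = np.asarray(x)[:, None]
    return (weights * np.exp(-0.5 * (x - atoms) ** 2) / np.sqrt(2 * np.pi)).sum(axis=1)

# "General" mixing measure: Uniform[-1, 1], represented on a fine grid.
fine_atoms = np.linspace(-1, 1, 2001)
fine_weights = np.full_like(fine_atoms, 1 / len(fine_atoms))

# Finite approximation with k equally spaced atoms and uniform weights.
k = 5
finite_atoms = np.linspace(-1, 1, k)
finite_weights = np.full(k, 1 / k)

grid = np.linspace(-5, 5, 1001)
err = np.max(np.abs(gaussian_mixture_density(grid, fine_atoms, fine_weights)
                    - gaussian_mixture_density(grid, finite_atoms, finite_weights)))
print(f"sup-norm density gap with k={k} atoms: {err:.2e}")
```

Because the Gaussian kernel smooths the mixing measure, even a handful of atoms already tracks the continuous mixture closely; the interesting question, which the paper answers, is how small k can be for a prescribed accuracy.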
no code implementations • 16 Aug 2023 • Pengkun Yang, Jingzhao Zhang
We show that a scaling law can have two phases: in the first phase, the generalization error depends polynomially on the data dimension and decreases fast; whereas in the second phase, the error depends exponentially on the data dimension and decreases slowly.
no code implementations • 31 May 2023 • Lili Su, Ming Xiang, Jiaming Xu, Pengkun Yang
Federated learning is a decentralized machine learning framework that enables collaborative model training without revealing raw data.
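The "collaborative training without revealing raw data" pattern can be sketched with the standard federated averaging loop: the server broadcasts a model, each client runs a few gradient steps on its private data, and only the updated parameters (never the data) return to the server. This is a generic FedAvg sketch on toy least-squares clients, not the specific algorithm studied in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each client holds a private least-squares dataset y = A @ w_star + noise.
# Raw (A, y) pairs never leave the client; only model vectors are exchanged.
w_star = np.array([1.0, -2.0, 0.5])
clients = []
for _ in range(10):
    A = rng.normal(size=(20, 3))
    y = A @ w_star + 0.1 * rng.normal(size=20)
    clients.append((A, y))

def local_update(w, A, y, lr=0.05, steps=5):
    """A few local gradient-descent steps on one client's private objective."""
    for _ in range(steps):
        w = w - lr * A.T @ (A @ w - y) / len(y)
    return w

# Federated averaging: broadcast, train locally, average the returned models.
w = np.zeros(3)
for _ in range(50):
    w = np.mean([local_update(w.copy(), A, y) for A, y in clients], axis=0)

print("distance to w_star:", np.linalg.norm(w - w_star))
```

Here all clients share one ground truth, so the averaged model converges near it; the papers listed on this page study the harder regimes where clients are heterogeneous or clustered.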
no code implementations • 15 Jun 2022 • Lili Su, Jiaming Xu, Pengkun Yang
This paper studies the problem of model training under Federated Learning when clients exhibit cluster structure.
no code implementations • 26 May 2022 • Xingjian Li, Pengkun Yang, Yangcheng Gu, Xueying Zhan, Tianyang Wang, Min Xu, Chengzhong Xu
We provide theoretical analyses by leveraging the small Gaussian noise theory and demonstrate that our method favors a subset with large and diverse gradients.
1 code implementation • 10 Dec 2021 • Tianyang Wang, Xingjian Li, Pengkun Yang, Guosheng Hu, Xiangrui Zeng, Siyu Huang, Cheng-Zhong Xu, Min Xu
In this work, we explore such an impact by theoretically proving that selecting unlabeled data of higher gradient norm leads to a lower upper bound on the test loss, resulting in better test performance.
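A gradient-norm selection rule can be sketched for a logistic model: since the unlabeled sample's label is unknown, one common proxy (an assumption here, not necessarily the paper's exact criterion) scores each point by the expected gradient length under the model's own predicted label distribution, then queries the top-k.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def expected_grad_norm(w, X):
    """Proxy score for unlabeled data: per-sample logistic-loss gradient is
    (p - y) x, so E_y ||(p - y) x|| = 2 p (1 - p) ||x|| under y ~ Bernoulli(p)."""
    p = sigmoid(X @ w)
    return 2 * p * (1 - p) * np.linalg.norm(X, axis=1)

# Toy pool: points near the decision boundary (p close to 1/2) score highest.
w = np.array([1.0, -1.0])
X_pool = rng.normal(size=(100, 2))
scores = expected_grad_norm(w, X_pool)
top_k = np.argsort(scores)[-10:]  # indices of the 10 highest-scoring samples
```

The rule concentrates queries on uncertain, high-magnitude samples, which is exactly the "higher gradient norm" preference the theoretical bound motivates.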
no code implementations • 29 Jun 2021 • Lili Su, Jiaming Xu, Pengkun Yang
We discover that when the data heterogeneity is moderate, a client with limited local data can benefit from a common model with a large federation gain.
no code implementations • 3 Jul 2020 • Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang
This new representation overcomes the degenerate situation where the network essentially has only one meaningful hidden unit in each middle layer, and further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization.
no code implementations • 14 Feb 2020 • Natalie Doss, Yihong Wu, Pengkun Yang, Harrison H. Zhou
This paper studies the optimal rate of estimation in a finite Gaussian location mixture model in high dimensions without separation conditions.
no code implementations • NeurIPS 2019 • Lili Su, Pengkun Yang
When the network is sufficiently over-parameterized, these matrices individually approximate an integral operator that is determined only by the feature vector distribution $\rho$.
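The phenomenon of an empirical matrix approximating an integral operator can be seen in a small numeric experiment. The example below uses the kernel min(s, t) on [0, 1] with $\rho$ = Uniform[0, 1] purely as an illustrative stand-in (it is not the paper's operator); this operator's eigenvalues are known in closed form, λ_k = 4/((2k−1)²π²), so the spectrum of the sampled matrix can be checked directly.

```python
import numpy as np

rng = np.random.default_rng(2)

# Integral operator (T f)(s) = ∫_0^1 min(s, t) f(t) dt has eigenvalues
# λ_k = 4 / ((2k - 1)^2 π^2).  Its empirical counterpart is the n x n matrix
# K_ij = min(s_i, s_j) / n built from samples s_i drawn from ρ = Uniform[0, 1].
n = 1000
s = rng.uniform(size=n)
K = np.minimum.outer(s, s) / n

top_empirical = np.linalg.eigvalsh(K)[-1]  # largest eigenvalue of K
top_true = 4 / np.pi ** 2                  # largest eigenvalue of T
print(top_empirical, top_true)
```

As n grows, the top eigenvalues of K concentrate around those of T, which is the sense in which the random matrices in the paper "approximate" a deterministic operator fixed by $\rho$ alone.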