no code implementations • 7 Aug 2024 • Pengxiang Zhao, Hanyu Hu, Ping Li, Yi Zheng, Zhefeng Wang, Xiaoming Yuan
Pruning is a critical strategy for compressing trained large language models (LLMs), aiming at substantial memory conservation and computational acceleration without compromising performance.
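The snippet above does not specify the paper's pruning criterion, so purely as a hedged point of reference, here is a minimal NumPy sketch of generic unstructured magnitude pruning (the function name and threshold rule are illustrative, not the method proposed in the paper):

```python
# Hedged sketch: generic unstructured magnitude pruning,
# NOT the criterion proposed in the paper above.
import numpy as np

def magnitude_prune(weight: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out (roughly) the `sparsity` fraction of smallest-magnitude weights."""
    k = int(sparsity * weight.size)
    if k == 0:
        return weight.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weight).ravel(), k - 1)[k - 1]
    return weight * (np.abs(weight) > threshold)

W = np.random.randn(512, 512)
W_sparse = magnitude_prune(W, sparsity=0.5)
print(f"nonzeros kept: {np.count_nonzero(W_sparse) / W.size:.2%}")
```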
no code implementations • 22 Mar 2024 • Pengxiang Zhao, Ping Li, Yingjie Gu, Yi Zheng, Stephan Ludger Kölker, Zhefeng Wang, Xiaoming Yuan
As deep learning models exponentially increase in size, optimizers such as Adam encounter significant memory consumption challenges due to the storage of first and second moment data.
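For context, the overhead comes from Adam keeping two extra buffers, $m_t$ and $v_t$, each the same size as the parameters; the standard Adam update (Kingma and Ba) is:

$$
\begin{aligned}
m_t &= \beta_1\, m_{t-1} + (1-\beta_1)\, g_t, \\
v_t &= \beta_2\, v_{t-1} + (1-\beta_2)\, g_t^{2}, \\
\theta_t &= \theta_{t-1} - \alpha\, \frac{m_t/(1-\beta_1^{t})}{\sqrt{v_t/(1-\beta_2^{t})} + \epsilon}.
\end{aligned}
$$

Since $m_t$ and $v_t$ match the parameter tensor in shape, optimizer state roughly triples the memory footprint relative to plain SGD, which is the cost the paper targets.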
1 code implementation • 23 Jan 2024 • Zimeng Wang, Zhiyang Dou, Rui Xu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Shiqing Xin, Taku Komura, Xiaoming Yuan, Wenping Wang
We introduce Coverage Axis++, a novel and efficient approach to 3D shape skeletonization.
1 code implementation • 13 Aug 2023 • Ming-Chih Lai, Yongcun Song, Xiaoming Yuan, Hangrui Yue, Tianyou Zeng
We show that the physics-informed neural networks (PINNs), in combination with some recently developed discontinuity capturing neural networks, can be applied to solve optimal control problems subject to partial differential equations (PDEs) with interfaces and some control constraints.
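As a hedged sketch of the problem class (the interface treatment and the specific control constraints are detailed in the paper, not reproduced here), a prototypical tracking-type control problem and a penalized PINN loss for it read:

$$
\min_{y,\; u \in U_{ad}}\; J(y,u) = \tfrac{1}{2}\,\|y - y_d\|_{L^2(\Omega)}^{2} + \tfrac{\alpha}{2}\,\|u\|_{L^2(\Omega)}^{2}
\quad\text{subject to}\quad \mathcal{E}(y,u) = 0 \ \text{in } \Omega,
$$

$$
\mathcal{L}(\theta,\phi) = J\big(y_\theta, u_\phi\big)
+ \frac{\lambda_r}{N_r}\sum_{i=1}^{N_r}\big|\mathcal{E}(y_\theta,u_\phi)(x_i)\big|^{2}
+ \frac{\lambda_b}{N_b}\sum_{j=1}^{N_b}\big|\mathcal{B}(y_\theta)(x_j)\big|^{2},
$$

where $y_\theta$ and $u_\phi$ are networks for the state and control, $\mathcal{E}$ is the PDE residual, and $\mathcal{B}$ collects boundary (and, here, interface) conditions.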
no code implementations • 1 Jul 2023 • Yongcun Song, Xiaoming Yuan, Hangrui Yue
The accelerated primal-dual method with operator learning is mesh-free, numerically efficient, and scalable to different types of PDEs.
1 code implementation • 16 Feb 2023 • Yongcun Song, Xiaoming Yuan, Hangrui Yue
We study the combination of the alternating direction method of multipliers (ADMM) with physics-informed neural networks (PINNs) for a general class of nonsmooth partial differential equation (PDE)-constrained optimization problems, where additional regularization can be employed for constraints on the control or design variables.
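For orientation, the splitting such a method builds on is the textbook ADMM for $\min_u f(u) + g(u)$ with $f$ the smooth (PINN-friendly) part and $g$ the nonsmooth regularizer: rewrite it as $\min_{u,z} f(u) + g(z)$ subject to $u = z$ and iterate (a generic form, not necessarily the paper's exact scheme):

$$
\begin{aligned}
u^{k+1} &= \operatorname*{arg\,min}_{u}\; f(u) + \tfrac{\beta}{2}\,\|u - z^{k} + \lambda^{k}/\beta\|^{2},\\
z^{k+1} &= \operatorname*{arg\,min}_{z}\; g(z) + \tfrac{\beta}{2}\,\|u^{k+1} - z + \lambda^{k}/\beta\|^{2},\\
\lambda^{k+1} &= \lambda^{k} + \beta\,\big(u^{k+1} - z^{k+1}\big).
\end{aligned}
$$

The $u$-subproblem is an unconstrained smooth problem that a PINN can be trained to solve, while the $z$-subproblem is typically a proximal step with a closed-form solution.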
no code implementations • 23 Feb 2022 • Chunhui Zhang, Xiaoming Yuan, Qianyun Zhang, Guangxu Zhu, Lei Cheng, Ning Zhang
To further adapt to both the varying data distributions and the heterogeneous embedded hardware platforms of different devices, a Cluster Federated Direct Neural Architecture Search (CFDNAS) framework, inspired by meta-learning, is proposed to achieve device-aware NAS: each device learns a deep learning model tailored to its particular data distribution and hardware constraint.
no code implementations • 9 Jan 2022 • Congpei An, Hao-Ning Wu, Xiaoming Yuan
Total variation (TV) regularization has phenomenally boosted various variational models for image processing tasks.
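For reference, the classical TV-regularized (ROF) denoising model behind this line of work is

$$
\min_{u}\; \frac{1}{2}\,\|u - f\|_{2}^{2} \;+\; \lambda\,\mathrm{TV}(u),
\qquad
\mathrm{TV}(u) = \int_{\Omega} |\nabla u|\,\mathrm{d}x,
$$

where $f$ is the observed image and $\lambda > 0$ trades data fidelity against the edge-preserving total variation penalty.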
no code implementations • 15 Jun 2021 • Risheng Liu, Xuan Liu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang
Bi-level optimization models are able to capture a wide range of complex learning tasks of practical interest.
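Concretely, a bi-level optimization (BLO) model has the standard form

$$
\min_{x \in \mathcal{X}}\; F\big(x, y^{*}(x)\big)
\qquad \text{subject to} \qquad
y^{*}(x) \in \operatorname*{arg\,min}_{y \in \mathcal{Y}}\; f(x, y),
$$

where the upper-level objective $F$ (e.g., validation loss) is evaluated at a lower-level solution $y^{*}(x)$ (e.g., trained weights); hyperparameter optimization and meta-learning are standard instances.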
1 code implementation • 16 Feb 2021 • Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang
In this work, we formulate bi-level optimization problems (BLOs) from an optimistic bi-level viewpoint and establish a new gradient-based algorithmic framework, named Bi-level Descent Aggregation (BDA), to partially address the above issues.
no code implementations • 7 Jan 2021 • Roland Glowinski, Yongcun Song, Xiaoming Yuan, Hangrui Yue
However, due to the additional divergence-free constraint on the control variable and the nonlinear relation between the state and control variables, it is challenging to compute the gradient and the optimal stepsize at each CG iteration, and thus nontrivial to implement the CG method.
Optimization and Control (MSC: 49M41, 35Q93, 49J20)
no code implementations • 6 Nov 2020 • Chunhui Zhang, Yongyuan Liang, Xiaoming Yuan, Lei Cheng
To further adapt to the varying data distributions of different clients, a Cluster Federated Direct Neural Architecture Search (CFDNAS) framework, inspired by meta-learning, is proposed to achieve client-aware NAS: each client learns a deep learning model tailored to its particular data distribution.
no code implementations • 27 Jun 2020 • Xingguo Li, Tuo Zhao, Xiaoming Yuan, Han Liu
This paper describes an R package named flare, which implements a family of new high dimensional regression methods (LAD Lasso, SQRT Lasso, $\ell_q$ Lasso, and Dantzig selector) and their extensions to sparse precision matrix estimation (TIGER and CLIME).
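Two of the estimators listed there, written out for reference in their standard forms (not quoted from the package documentation):

$$
\text{SQRT Lasso:}\quad
\hat{\beta} = \operatorname*{arg\,min}_{\beta}\; \frac{1}{\sqrt{n}}\,\|y - X\beta\|_{2} + \lambda\,\|\beta\|_{1},
$$

$$
\text{Dantzig selector:}\quad
\hat{\beta} = \operatorname*{arg\,min}_{\beta}\; \|\beta\|_{1}
\quad \text{s.t.} \quad \|X^{\top}(y - X\beta)\|_{\infty} \le \lambda .
$$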
no code implementations • ICML 2020 • Risheng Liu, Pan Mu, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang
In recent years, a variety of gradient-based first-order methods have been developed to solve bi-level optimization problems for learning applications.
no code implementations • 6 Jul 2019 • Risheng Liu, Long Ma, Xiaoming Yuan, Shangzhi Zeng, Jin Zhang
This paper firstly proposes a convex bilevel optimization paradigm to formulate and optimize popular learning and vision problems in real-world scenarios.
no code implementations • 25 Apr 2019 • Wei Shen, Zhenhuan Yang, Yiming Ying, Xiaoming Yuan
From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses.
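To make the analyzed algorithm concrete, here is a hedged Python sketch of SGD over a pairwise loss; the pairwise hinge loss is only an illustrative member of the class of pairwise losses studied in the paper:

```python
# Hedged sketch: SGD with a pairwise hinge loss (illustrative choice of loss).
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n))

w, eta = np.zeros(d), 0.05
for t in range(1, 20001):
    i, j = rng.integers(n, size=2)        # draw a pair of examples
    if y[i] == y[j]:
        continue                          # the pairwise loss compares opposite labels
    s = (y[i] - y[j]) / 2                 # +1 if i is the positive example, else -1
    if s * (w @ (X[i] - X[j])) < 1:       # hinge violated: take a subgradient step
        w += eta / np.sqrt(t) * s * (X[i] - X[j])

# fraction of correctly ordered positive/negative pairs (empirical AUC)
pos, neg = X[y > 0] @ w, X[y < 0] @ w
print("empirical AUC:", (pos[:, None] > neg[None, :]).mean())
```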
no code implementations • ICML 2017 • Zheng Xu, Gavin Taylor, Hao Li, Mario Figueiredo, Xiaoming Yuan, Tom Goldstein
The alternating direction method of multipliers (ADMM) is commonly used for distributed model fitting problems, but its performance and reliability depend strongly on user-defined penalty parameters.
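The paper's own adaptation scheme is not reproduced here; for comparison, the classical residual-balancing heuristic (from Boyd et al.'s ADMM survey) adapts the penalty from the primal and dual residuals as in this sketch:

```python
# Hedged sketch: residual-balancing penalty update for ADMM (the classical
# heuristic from Boyd et al.; not the adaptive scheme of the paper above).
def update_penalty(rho: float, primal_res: float, dual_res: float,
                   mu: float = 10.0, tau: float = 2.0) -> float:
    """Grow rho when the primal residual dominates, shrink it when the
    dual residual dominates, otherwise leave it unchanged."""
    if primal_res > mu * dual_res:
        return rho * tau
    if dual_res > mu * primal_res:
        return rho / tau
    return rho
```

Inside each ADMM iteration one would call `rho = update_penalty(rho, norm(r_k), norm(s_k))` and, when working with the scaled dual variable, rescale it by the old-to-new ratio of `rho`.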
no code implementations • CVPR 2017 • Zheng Xu, Mario A. T. Figueiredo, Xiaoming Yuan, Christoph Studer, Tom Goldstein
Relaxed ADMM is a generalization of ADMM that often achieves better performance, but its efficiency depends strongly on algorithm parameters that must be chosen by an expert user.
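For reference, with the splitting $\min_{x,z} f(x) + g(z)$ subject to $x = z$ and scaled dual variable $u$, the relaxation enters the standard iteration through a parameter $\alpha \in (0, 2)$ (textbook form; choosing such parameters without expert tuning is the issue raised above):

$$
\hat{x}^{\,k+1} = \alpha\, x^{k+1} + (1-\alpha)\, z^{k},
\qquad
z^{k+1} = \operatorname{prox}_{g/\rho}\!\big(\hat{x}^{\,k+1} + u^{k}\big),
\qquad
u^{k+1} = u^{k} + \hat{x}^{\,k+1} - z^{k+1},
$$

where $\alpha > 1$ (over-relaxation) often accelerates convergence in practice.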
no code implementations • NeurIPS 2015 • Tom Goldstein, Min Li, Xiaoming Yuan
The alternating direction method of multipliers (ADMM) is an important tool for solving complex optimization problems, but it involves minimization sub-steps that are often difficult to solve efficiently.
1 code implementation • 2 May 2013 • Tom Goldstein, Min Li, Xiaoming Yuan, Ernie Esser, Richard Baraniuk
The Primal-Dual hybrid gradient (PDHG) method is a powerful optimization scheme that breaks complex problems into simple sub-steps.
Numerical Analysis (MSC: 65K15; ACM: G.1.6)
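For reference, the basic PDHG iteration (in Chambolle–Pock form) for the saddle-point problem $\min_{x}\max_{y}\; \langle Kx, y\rangle + g(x) - f^{*}(y)$ alternates simple proximal sub-steps:

$$
\begin{aligned}
y^{n+1} &= \operatorname{prox}_{\sigma f^{*}}\!\big(y^{n} + \sigma K \bar{x}^{n}\big),\\
x^{n+1} &= \operatorname{prox}_{\tau g}\!\big(x^{n} - \tau K^{\top} y^{n+1}\big),\\
\bar{x}^{n+1} &= x^{n+1} + \theta\,\big(x^{n+1} - x^{n}\big),
\end{aligned}
$$

with step sizes satisfying $\sigma \tau \|K\|^{2} < 1$ and extrapolation $\theta \in [0, 1]$; each sub-step is a proximal operation, which is what makes the scheme attractive for imaging problems such as TV minimization.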