Search Results for author: Shangqian Gao

Found 27 papers, 8 papers with code

Unlocking Memorization in Large Language Models with Dynamic Soft Prompting

no code implementations 20 Sep 2024 Zhepeng Wang, Runxue Bao, Yawen Wu, Jackson Taylor, Cao Xiao, Feng Zheng, Weiwen Jiang, Shangqian Gao, Yanfu Zhang

Pretrained large language models (LLMs) have revolutionized natural language processing (NLP) tasks such as summarization, question answering, and translation.

Code Generation Memorization +2

MoDeGPT: Modular Decomposition for Large Language Model Compression

no code implementations 19 Aug 2024 Chi-Heng Lin, Shangqian Gao, James Seale Smith, Abhishek Patel, Shikhar Tuli, Yilin Shen, Hongxia Jin, Yen-Chang Hsu

Large Language Models (LLMs) have reshaped the landscape of artificial intelligence by demonstrating exceptional performance across various tasks.

Language Modelling Large Language Model +1

Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models

1 code implementation 17 Jun 2024 Alireza Ganjdanesh, Reza Shirkavand, Shangqian Gao, Heng Huang

Each architecture code represents a specialized model tailored to the prompts assigned to it, and the number of codes is a hyperparameter.

Contrastive Learning Image Generation
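
The snippet above is easier to picture with a toy routing rule. Below is a hypothetical sketch of the "architecture code" idea: a prompt embedding is assigned to the nearest of K code centroids, and each code indexes a pruned model specialized for its prompts. `code_centroids`, `pruned_models`, and `embed` are illustrative placeholders, not the paper's API.

```python
import torch

# Hypothetical sketch: K is the "number of codes" hyperparameter mentioned above.
K, dim = 8, 512
code_centroids = torch.randn(K, dim)            # learned in the actual method

def assign_code(prompt_embedding: torch.Tensor) -> int:
    """Route a prompt to its nearest architecture code (toy nearest-centroid rule)."""
    dists = torch.cdist(prompt_embedding[None], code_centroids)[0]
    return int(dists.argmin())

# usage (placeholders): specialized_model = pruned_models[assign_code(embed(prompt))]
```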

Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

no code implementations CVPR 2024 Alireza Ganjdanesh, Shangqian Gao, Heng Huang

We address this challenge by designing a mechanism to model the complex changing dynamics of the reward function and provide a representation of it to the RL agent.

Reinforcement Learning (RL)

Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch

1 code implementation CVPR 2024 Xidong Wu, Shangqian Gao, Zeyu Zhang, Zhenzhen Li, Runxue Bao, Yanfu Zhang, Xiaoqian Wang, Heng Huang

Current techniques for deep neural network (DNN) pruning often involve intricate multi-step processes that require domain-specific expertise, making their widespread adoption challenging.

Network Pruning

Device-Wise Federated Network Pruning

1 code implementation CVPR 2024 Shangqian Gao, Junyi Li, Zeyu Zhang, Yanfu Zhang, Weidong Cai, Heng Huang

Neural network pruning, particularly channel pruning, is a widely used technique for compressing deep learning models so that they can be deployed on edge devices with limited resources.

Edge-computing Federated Learning +1

BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

no code implementations CVPR 2024 Shangqian Gao, Yanfu Zhang, Feihu Huang, Heng Huang

Most existing dynamic or runtime channel pruning methods have to store all weights to achieve efficient inference, which incurs extra storage costs.
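
To illustrate the storage issue mentioned above (not the paper's unified method), here is a minimal dynamic channel-gating layer in the usual runtime-pruning style: the gates depend on each input, so every channel's weights must stay resident.

```python
import torch
import torch.nn as nn

class DynamicGatedConv(nn.Module):
    """Toy runtime channel pruning: input-dependent gates scale the conv output."""
    def __init__(self, in_ch=64, out_ch=128):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)   # full weights kept in memory
        self.gate = nn.Sequential(                            # input-dependent channel gates
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, out_ch), nn.Sigmoid(),
        )

    def forward(self, x):
        g = self.gate(x)                         # (batch, out_ch) gates in (0, 1)
        return self.conv(x) * g[:, :, None, None]

# a static scheme instead commits to one mask, so pruned channels' weights can be dropped entirely.
```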

Token Fusion: Bridging the Gap between Token Pruning and Token Merging

no code implementations 2 Dec 2023 Minchul Kim, Shangqian Gao, Yen-Chang Hsu, Yilin Shen, Hongxia Jin

In this paper, we introduce "Token Fusion" (ToFu), a method that amalgamates the benefits of both token pruning and token merging.

Computational Efficiency Image Generation
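
As a rough illustration of the merging side of this idea (a toy rule, not the exact ToFu operator), the sketch below repeatedly averages the most similar adjacent token pair instead of discarding tokens outright.

```python
import torch

def fuse_tokens(tokens: torch.Tensor, num_to_remove: int) -> torch.Tensor:
    """Toy token merging: tokens has shape (seq_len, dim); each step removes one token
    by averaging the most redundant adjacent pair rather than dropping it."""
    tokens = tokens.clone()
    for _ in range(num_to_remove):
        sims = torch.cosine_similarity(tokens[:-1], tokens[1:], dim=-1)
        i = int(sims.argmax())                      # most similar adjacent pair
        merged = (tokens[i] + tokens[i + 1]) / 2    # average instead of dropping
        tokens = torch.cat([tokens[:i], merged[None], tokens[i + 2:]], dim=0)
    return tokens

# example: reduce a 197-token ViT sequence by 50 tokens
# fused = fuse_tokens(torch.randn(197, 768), num_to_remove=50)
```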

Structural Alignment for Network Pruning through Partial Regularization

no code implementations ICCV 2023 Shangqian Gao, Zeyu Zhang, Yanfu Zhang, Feihu Huang, Heng Huang

To mitigate this gap, we first learn a target sub-network during the model training process, and then we use this sub-network to guide the learning of model weights through partial regularization.

Network Pruning
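
A minimal sketch of what partial regularization can look like, assuming the target sub-network is given as a binary keep-mask per layer (an illustrative convention, not the paper's exact formulation): only weights outside the sub-network are penalized, pulling them toward zero while the kept weights train freely.

```python
import torch

def partial_l2_penalty(weight: torch.Tensor, keep_mask: torch.Tensor, strength: float = 1e-4):
    """Penalize only the weights excluded from the target sub-network.

    keep_mask has the same shape as weight: 1 for entries the learned sub-network
    keeps, 0 for those it would prune (hypothetical convention).
    """
    pruned_part = weight * (1.0 - keep_mask)
    return strength * pruned_part.pow(2).sum()

# usage: loss = task_loss + partial_l2_penalty(conv.weight, mask_for_this_layer)
```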

Interpretations Steered Network Pruning via Amortized Inferred Saliency Maps

1 code implementation 7 Sep 2022 Alireza Ganjdanesh, Shangqian Gao, Heng Huang

To fill in this gap, we propose to address the channel pruning problem from a novel perspective by leveraging the interpretations of a model to steer the pruning process, thereby utilizing information from both inputs and outputs of the model.

Inductive Bias Network Pruning

Enhanced Bilevel Optimization via Bregman Distance

no code implementations 26 Jul 2021 Feihu Huang, Junyi Li, Shangqian Gao, Heng Huang

Specifically, we propose a bilevel optimization method based on Bregman distance (BiO-BreD) to solve deterministic bilevel problems, which achieves a lower computational complexity than the best known results.

Bilevel Optimization Hyperparameter Optimization +2
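
For reference, the Bregman distance the method's name refers to, together with a generic Bregman proximal (mirror-descent style) step; this is the standard schematic form, not the paper's exact BiO-BreD update.

```latex
% Bregman distance generated by a differentiable, strongly convex \psi
D_\psi(x, y) = \psi(x) - \psi(y) - \langle \nabla \psi(y),\, x - y \rangle

% generic Bregman proximal step with gradient estimate g_t and step size \eta
x_{t+1} = \arg\min_{x} \; \langle g_t,\, x \rangle + \tfrac{1}{\eta}\, D_\psi(x, x_t)
```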

Bregman Gradient Policy Optimization

1 code implementation ICLR 2022 Feihu Huang, Shangqian Gao, Heng Huang

In the paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques.

reinforcement-learning Reinforcement Learning +1

BiAdam: Fast Adaptive Bilevel Optimization Methods

no code implementations 21 Jun 2021 Feihu Huang, Junyi Li, Shangqian Gao

To fill this gap, in this paper we propose a novel fast adaptive bilevel framework for solving stochastic bilevel optimization problems in which the outer problem is possibly nonconvex and the inner problem is strongly convex.

Bilevel Optimization Meta-Learning +1
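
The problem class described above is the standard stochastic bilevel formulation; written out schematically, with $F$, $f$, and $g$ as generic objectives:

```latex
% outer objective F may be nonconvex in x; inner objective g is strongly convex in y
\min_{x} \; F(x) := \mathbb{E}_{\xi}\big[f\big(x,\, y^*(x);\, \xi\big)\big]
\quad \text{s.t.} \quad
y^*(x) = \arg\min_{y} \; \mathbb{E}_{\zeta}\big[g(x, y; \zeta)\big]
```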

Network Pruning via Performance Maximization

1 code implementation CVPR 2021 Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang

Specifically, we train a stand-alone neural network to predict sub-networks' performance and then maximize the output of the network as a proxy of accuracy to guide pruning.

Model Compression Network Pruning
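
A hypothetical sketch of the proxy-maximization idea in the snippet above, assuming a performance predictor has already been fit to (sub-network mask, measured accuracy) pairs; the names and the sparsity penalty are illustrative, not the paper's implementation.

```python
import torch
import torch.nn as nn

num_channels = 256  # assumed total number of prunable channels

# stand-alone predictor: maps a (soft) channel mask to a predicted accuracy;
# assumed to have been trained beforehand on sampled (mask, accuracy) pairs
predictor = nn.Sequential(
    nn.Linear(num_channels, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
)

mask_logits = torch.zeros(num_channels, requires_grad=True)
optimizer = torch.optim.Adam([mask_logits], lr=0.1)

for _ in range(100):
    soft_mask = torch.sigmoid(mask_logits)             # relaxed 0/1 channel gates
    predicted_acc = predictor(soft_mask).squeeze()     # proxy for sub-network accuracy
    loss = -predicted_acc + 0.5 * soft_mask.mean()     # maximize proxy, encourage sparsity
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

keep_channels = torch.sigmoid(mask_logits) > 0.5       # final binary channel selection
```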

Adversarial Attack on Deep Cross-Modal Hamming Retrieval

no code implementations ICCV 2021 Chao Li, Shangqian Gao, Cheng Deng, Wei Liu, Heng Huang

Specifically, given a target model, we first construct its substitute model to exploit cross-modal correlations within the Hamming space, with which we create adversarial examples using only a limited number of queries to the target model.

Adversarial Attack Cross-Modal Retrieval +2

Exploration and Estimation for Model Compression

no code implementations ICCV 2021 Yanfu Zhang, Shangqian Gao, Heng Huang

In this paper, we focus on the discrimination-aware compression of Convolutional Neural Networks (CNNs).

Model Compression

Model Compression via Hyper-Structure Network

no code implementations 1 Jan 2021 Shangqian Gao, Feihu Huang, Heng Huang

In this paper, we propose a novel channel pruning method to solve the problem of compression and acceleration of Convolutional Neural Networks (CNNs).

Model Compression
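
One common way to realize a channel-pruning hypernetwork of this flavor is sketched below; it is an assumption-laden illustration (layer widths, gate parameterization, and thresholding are placeholders), not the paper's architecture.

```python
import torch
import torch.nn as nn

layer_widths = [64, 128, 256]          # assumed channel counts of the target CNN

class HyperStructure(nn.Module):
    """Toy hyper-structure network: emits one soft gate vector per conv layer."""
    def __init__(self, widths, hidden=64):
        super().__init__()
        self.embed = nn.Parameter(torch.randn(len(widths), hidden))
        self.heads = nn.ModuleList([nn.Linear(hidden, w) for w in widths])

    def forward(self):
        # one gate vector in (0, 1) per layer
        return [torch.sigmoid(head(e)) for head, e in zip(self.heads, self.embed)]

gates = HyperStructure(layer_widths)()
# during training, scale each conv output by its gate vector and add a resource-aware
# sparsity term on the gates; afterwards, drop channels whose gate falls below a threshold.
```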

Gradient Descent Ascent for Minimax Problems on Riemannian Manifolds

no code implementations 13 Oct 2020 Feihu Huang, Shangqian Gao

At the same time, we present an effective Riemannian stochastic gradient descent ascent (RSGDA) algorithm for the stochastic minimax optimization, which has a sample complexity of $O(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary solution.
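
Schematically, the setting and a generic stochastic gradient descent ascent step look as follows, with $R_x$ a retraction on the manifold and $\mathcal{P}_{\mathcal{Y}}$ a projection; this is a textbook-style sketch, not the exact RSGDA algorithm.

```latex
% minimax problem with x constrained to a Riemannian manifold \mathcal{M}
\min_{x \in \mathcal{M}} \max_{y \in \mathcal{Y}} \; f(x, y)

% schematic stochastic gradient descent ascent step on a sample \xi_t
x_{t+1} = R_{x_t}\!\big(-\gamma\, \mathrm{grad}_x f(x_t, y_t; \xi_t)\big), \qquad
y_{t+1} = \mathcal{P}_{\mathcal{Y}}\big(y_t + \lambda\, \nabla_y f(x_t, y_t; \xi_t)\big)
```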

Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

no code implementations 18 Aug 2020 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point.

Adversarial Attack

Momentum-Based Policy Gradient Methods

1 code implementation ICML 2020 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

In particular, we present a non-adaptive version of the IS-MBPG method, i.e., IS-MBPG*, which also reaches the best known sample complexity of $O(\epsilon^{-3})$ without any large batches.

Policy Gradient Methods

Discrete Model Compression With Resource Constraint for Deep Neural Networks

no code implementations CVPR 2020 Shangqian Gao, Feihu Huang, Jian Pei, Heng Huang

In this paper, we address the problem of compressing and accelerating Convolutional Neural Networks (CNNs).

Model Compression

Cross-Modal Learning with Adversarial Samples

1 code implementation NeurIPS 2019 Chao Li, Shangqian Gao, Cheng Deng, De Xie, Wei Liu

Extensive experiments on two cross-modal benchmark datasets show that the adversarial examples produced by our CMLA are efficient in fooling a target deep cross-modal hashing network.

Retrieval

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

no code implementations 30 Jul 2019 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving complex machine learning problems where gradients of the objective functions are unavailable or computationally prohibitive.

Adversarial Attack
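
As background for the snippet above, a generic Gaussian-smoothing gradient estimator of the kind zeroth-order methods rely on is sketched below (an illustrative estimator built only from function queries, not the paper's specific ADMM variants).

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_samples=20, rng=np.random.default_rng(0)):
    """Estimate a gradient from function values only:
    grad ≈ mean over u ~ N(0, I) of (f(x + mu*u) - f(x)) / mu * u."""
    grad = np.zeros_like(x)
    fx = f(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)
        grad += (f(x + mu * u) - fx) / mu * u
    return grad / num_samples

# usage: g = zo_gradient(lambda x: np.sum(x**2), np.ones(5))
```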

Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization

no code implementations 29 May 2019 Feihu Huang, Shangqian Gao, Songcan Chen, Heng Huang

In particular, our methods not only reach the best convergence rate of $O(1/T)$ for nonconvex optimization, but are also able to effectively solve many complex machine learning problems with multiple regularized penalties and constraints.

Adversarial Attack BIG-bench Machine Learning +1
