Search Results for author: Guangyu Sun

Found 27 papers, 8 papers with code

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search

no code implementations ECCV 2020 Zhihang Yuan, Bingzhe Wu, Guangyu Sun, Zheng Liang, Shiwan Zhao, Weichen Bi

To this end, based on a given CNN model, we first generate a CNN architecture space in which each architecture is a multi-stage CNN generated from the given model using some predefined transformations.

Neural Architecture Search

Towards Multi-modal Transformers in Federated Learning

no code implementations 18 Apr 2024 Guangyu Sun, Matias Mendieta, Aritra Dutta, Xin Li, Chen Chen

Multi-modal transformers mark significant progress in different domains, but siloed high-quality data hinders their further improvement.

LLM Inference Unveiled: Survey and Roofline Model Insights

2 code implementations 26 Feb 2024 Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on roofline model for systematic analysis of LLM inference techniques.

Knowledge Distillation, Language Modelling, +3

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models

1 code implementation 10 Dec 2023 Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun

This paper explores a new post-hoc training-free compression paradigm for compressing Large Language Models (LLMs) to facilitate their wider adoption in various computing environments.

FedPerfix: Towards Partial Model Personalization of Vision Transformers in Federated Learning

1 code implementation ICCV 2023 Guangyu Sun, Matias Mendieta, Jun Luo, Shandong Wu, Chen Chen

Personalized Federated Learning (PFL) represents a promising solution for decentralized learning in heterogeneous data environments.

Personalized Federated Learning

RPTQ: Reorder-based Post-training Quantization for Large Language Models

1 code implementation 3 Apr 2023 Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers.


Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance

no code implementations 23 Mar 2023 Zhihang Yuan, Jiawei Liu, Jiaxiang Wu, Dawei Yang, Qiang Wu, Guangyu Sun, Wenyu Liu, Xinggang Wang, Bingzhe Wu

Post-training quantization (PTQ) is a popular method for compressing deep neural networks (DNNs) without modifying their original architecture or training procedures.

Benchmarking, Data Augmentation, +1

Latency-aware Spatial-wise Dynamic Networks

2 code implementations 12 Oct 2022 Yizeng Han, Zhihang Yuan, Yifan Pu, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang

The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties.

Image Classification, Instance Segmentation, +4

Conquering the Communication Constraints to Enable Large Pre-Trained Models in Federated Learning

no code implementations 4 Oct 2022 Guangyu Sun, Umar Khalid, Matias Mendieta, Taojiannan Yang, Chen Chen

Recently, the use of small pre-trained models has been shown effective in federated learning optimization and improving convergence.

Federated Learning

PTQ4ViT: Post-Training Quantization Framework for Vision Transformers with Twin Uniform Quantization

1 code implementation 24 Nov 2021 Zhihang Yuan, Chenhao Xue, Yiqi Chen, Qiang Wu, Guangyu Sun

We observe that the distributions of activation values after softmax and GELU functions are quite different from the Gaussian distribution.


Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention

no code implementations 18 Oct 2021 Zhe Zhou, Junlin Liu, Zhenyu Gu, Guangyu Sun

To enable such an algorithm with lower latency and better energy efficiency, we also propose an Energon co-processor architecture.


GNNSampler: Bridging the Gap between Sampling Algorithms of GNN and Hardware

1 code implementation 26 Aug 2021 Xin Liu, Mingyu Yan, Shuhan Song, Zhengyang Lv, WenMing Li, Guangyu Sun, Xiaochun Ye, Dongrui Fan

Extensive experiments show that our method is universal to mainstream sampling algorithms and helps significantly reduce the training time, especially in large-scale graphs.

BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices

no code implementations 13 Apr 2021 Zhe Zhou, Bizhao Shi, Zhe Zhang, Yijin Guan, Guangyu Sun, Guojie Luo

At the hardware design level, we propose a pipelined CirCore architecture, which supports efficient block-circulant matrices computation.


ENAS4D: Efficient Multi-stage CNN Architecture Search for Dynamic Inference

no code implementations 19 Sep 2020 Zhihang Yuan, Xin Liu, Bingzhe Wu, Guangyu Sun

The inference of an input sample can exit at an early stage if the prediction of that stage is confident enough.

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search

no code implementations 16 Nov 2019 Zhihang Yuan, Bingzhe Wu, Zheng Liang, Shiwan Zhao, Weichen Bi, Guangyu Sun

Recently, dynamic inference has emerged as a promising way to reduce the computational cost of deep convolutional neural networks (CNNs).

Neural Architecture Search

Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics

no code implementations 5 Oct 2019 Bingzhe Wu, Chaochao Chen, Shiwan Zhao, Cen Chen, Yuan YAO, Guangyu Sun, Li Wang, Xiaolu Zhang, Jun Zhou

Based on this framework, we demonstrate that SGLD can prevent the information leakage of the training dataset to a certain extent.

Generalization Bounds

Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection

no code implementations NeurIPS 2019 Bingzhe Wu, Shiwan Zhao, Chaochao Chen, Haoyang Xu, Li Wang, Xiaolu Zhang, Guangyu Sun, Jun Zhou

In this paper, we aim to understand the generalization properties of generative adversarial networks (GANs) from a new perspective of privacy protection.

BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference

no code implementations 3 Jun 2019 Peichen Xie, Bingzhe Wu, Guangyu Sun

Specifically, we use homomorphic encryption to protect a client's raw data and use Bayesian neural networks to protect the DNN weights in a cloud server.

Privacy Preserving

G2C: A Generator-to-Classifier Framework Integrating Multi-Stained Visual Cues for Pathological Glomerulus Classification

no code implementations 30 Jun 2018 Bingzhe Wu, Xiaolu Zhang, Shiwan Zhao, Lingxi Xie, Caihong Zeng, Zhihong Liu, Guangyu Sun

Given an input image from a specified stain, several generators are first applied to estimate its appearances in other staining methods, and a classifier follows to combine visual cues from different stains for prediction (whether it is pathological, or which type of pathology it has).

Classification, Decision Making, +2
