Search Results for author: Weizhe Hua

Found 10 papers, 3 papers with code

Structured Pruning is All You Need for Pruning CNNs at Initialization

no code implementations • 4 Mar 2022 Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang

In addition, since PreCropping compresses CNNs at initialization, the computational and memory costs of CNNs are reduced for both training and inference on commodity hardware.

Model Compression
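The snippet above states the benefit (pruning at initialization saves cost for both training and inference) but not the mechanism. As a rough, generic illustration of what channel-level (structured) pruning at initialization can look like — not the paper's PreCropping algorithm, and with an assumed magnitude-based scoring heuristic — here is a minimal numpy sketch:

```python
import numpy as np

# Generic sketch of *structured* pruning at initialization: whole output
# channels of a randomly initialized conv layer are removed before any
# training, so the smaller dense layer is cheaper for both training and
# inference. The channel-scoring heuristic below is an assumption for
# illustration, not the paper's PreCropping criterion.

rng = np.random.default_rng(0)

# Randomly initialized conv weights: (out_channels, in_channels, kH, kW)
weights = rng.standard_normal((64, 32, 3, 3)).astype(np.float32)

keep_ratio = 0.5  # assumed compression target
n_keep = int(weights.shape[0] * keep_ratio)

# Score each output channel by its total weight magnitude (simple heuristic).
scores = np.abs(weights).sum(axis=(1, 2, 3))
keep_idx = np.sort(np.argsort(scores)[-n_keep:])

pruned = weights[keep_idx]  # smaller dense tensor, no sparse masks needed
print(weights.shape, "->", pruned.shape)  # (64, 32, 3, 3) -> (32, 32, 3, 3)
```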

Transformer Quality in Linear Time

1 code implementation • 21 Feb 2022 Weizhe Hua, Zihang Dai, Hanxiao Liu, Quoc V. Le

We revisit the design choices in Transformers, and propose methods to address their weaknesses in handling long sequences.

8k • Language Modelling • +1

Sinan: Data-Driven, QoS-Aware Cluster Management for Microservices

no code implementations • 27 May 2021 Yanqi Zhang, Weizhe Hua, Zhuangzhuang Zhou, Edward Suh, Christina Delimitrou

Cloud applications are increasingly shifting from large monolithic services to large numbers of loosely-coupled, specialized microservices.

Management

Contrastive Weight Regularization for Large Minibatch SGD

no code implementations • 17 Nov 2020 Qiwei Yuan, Weizhe Hua, Yi Zhou, Cunxi Yu

The minibatch stochastic gradient descent method (SGD) is widely applied in deep learning due to its efficiency and scalability that enable training deep networks with a large volume of data.
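Since this snippet is about the minibatch SGD optimizer itself, here is a minimal, generic sketch of the minibatch SGD update on a simple least-squares objective. It shows only the base optimizer the snippet refers to; the paper's contrastive weight regularization term is not included:

```python
import numpy as np

# Minimal, generic minibatch SGD on a least-squares problem.
rng = np.random.default_rng(0)
X = rng.standard_normal((1024, 10))
true_w = rng.standard_normal(10)
y = X @ true_w + 0.01 * rng.standard_normal(1024)

w = np.zeros(10)
lr, batch_size = 0.1, 64

for epoch in range(20):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        # Gradient of the mean squared error on this minibatch only.
        grad = 2.0 * xb.T @ (xb @ w - yb) / len(idx)
        w -= lr * grad

print(np.linalg.norm(w - true_w))  # should be close to zero
```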

GuardNN: Secure Accelerator Architecture for Privacy-Preserving Deep Learning

no code implementations • 26 Aug 2020 Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

This paper proposes GuardNN, a secure DNN accelerator that provides hardware-based protection for user data and model parameters even in an untrusted environment.

Privacy Preserving • Privacy Preserving Deep Learning

MGX: Near-Zero Overhead Memory Protection for Data-Intensive Accelerators

no code implementations • 20 Apr 2020 Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

This paper introduces MGX, a near-zero overhead memory protection scheme for hardware accelerators.

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations

1 code implementation • ICLR 2020 Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang

The proposed approach is applicable to a variety of DNN architectures and significantly reduces the computational cost of DNN execution with almost no accuracy loss.

Quantization
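The snippet above does not describe how the dual-precision gating works. As a loose, assumed illustration of what dynamic dual-precision activations can mean in general — a cheap low-precision pass plus selective high-precision recomputation, not necessarily the paper's exact gating rule — here is a small numpy sketch:

```python
import numpy as np

# Illustrative only: compute an activation at low precision, then use high
# precision only where the low-precision result exceeds a threshold. On real
# hardware only the gated entries would be recomputed; here the full
# high-precision result is formed just to make the sketch simple.

def quantize(x, bits):
    scale = (2 ** (bits - 1) - 1) / (np.abs(x).max() + 1e-12)
    return np.round(x * scale) / scale

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)
w = rng.standard_normal((16, 16)).astype(np.float32)

low = quantize(x, bits=4) @ quantize(w, bits=4)  # cheap low-precision pass
gate = np.abs(low) > 1.0                         # assumed importance threshold
high = x @ w                                     # full-precision reference
out = np.where(gate, high, low)                  # high precision only where gated on

print("fraction kept at high precision:", gate.mean())
```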

Channel Gating Neural Networks

1 code implementation • NeurIPS 2019 Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, G. Edward Suh

Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet.

Knowledge Distillation • Network Pruning
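The result above combines channel gating with knowledge distillation. For reference, here is a generic temperature-softened distillation loss in the usual Hinton-style formulation — a standard recipe, not necessarily the exact setup used in the paper:

```python
import numpy as np

# Generic knowledge-distillation loss: the student matches the teacher's
# temperature-softened class distribution, scaled by T^2.

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=4.0):
    p_t = softmax(teacher_logits, T)
    log_p_s = np.log(softmax(student_logits, T) + 1e-12)
    # KL(teacher || student), averaged over the batch.
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - log_p_s), axis=-1)
    return float(np.mean(kl) * T * T)

rng = np.random.default_rng(0)
print(distillation_loss(rng.standard_normal((8, 10)),
                        rng.standard_normal((8, 10))))
```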
