no code implementations • 7 Aug 2023 • Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li
With the proposed integer quantization search, we increase the accuracy of ResNet-18 on ImageNet by 1.31 percentage points and ResNet-50 by 0.90 percentage points over previous methods at equivalent model cost.
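The search assigns an integer bitwidth per layer; the quantizer underneath such a search can be sketched as plain uniform symmetric quantization. This is a minimal illustration, not the paper's search procedure, and the function names are ours:

```python
import numpy as np

def quantize_int(x, bits):
    """Uniform symmetric quantization to a signed integer grid, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

w = np.random.default_rng(0).standard_normal((64, 64))
# A bitwidth search trades reconstruction error like this against model cost, per layer.
print(np.abs(w - quantize_int(w, 4)).mean(), np.abs(w - quantize_int(w, 8)).mean())
```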
no code implementations • 16 Feb 2023 • Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang
Specifically, the schedule works on a PyTorch model and uses a set of schedule primitives to convert the model for common model training optimizations such as high-performance kernels, effective 3D parallelism, and efficient activation checkpointing.
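The paper's schedule primitives form their own interface, which is not reproduced here. For orientation, one of the optimizations named above, activation checkpointing, can be expressed directly with stock PyTorch:

```python
import torch
from torch.utils.checkpoint import checkpoint

block = torch.nn.Sequential(
    torch.nn.Linear(512, 2048), torch.nn.GELU(), torch.nn.Linear(2048, 512))
x = torch.randn(8, 512, requires_grad=True)

# Recompute the block's activations during backward instead of storing them,
# trading extra compute for a smaller training memory footprint.
y = checkpoint(block, x, use_reentrant=False)
y.sum().backward()
```

A schedule-based approach applies such transformations to an existing model declaratively, without editing the model definition itself.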
no code implementations • 9 Feb 2023 • Yichi Zhang, Ankush Garg, Yuan Cao, Łukasz Lew, Behrooz Ghorbani, Zhiru Zhang, Orhan Firat
In this work, we propose a novel binarization technique for Transformers applied to machine translation (BMT), the first of its kind.
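BMT's full formulation is in the paper; the generic building block it rests on, sign binarization of weights with a straight-through gradient estimator (STE), can be sketched as follows:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through gradient estimator."""
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()  # pass gradients only inside [-1, 1]

w = torch.randn(4, 4, requires_grad=True)
w_bin = BinarizeSTE.apply(w)   # forward computes sign(w)
w_bin.sum().backward()         # backward flows through the STE mask
```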
no code implementations • 25 Jul 2022 • Yuwei Hu, Jiajie Li, Zhongming Yu, Zhiru Zhang
To understand whether persistent memory is a good fit for GNNRecSys training, we perform an in-depth characterization of GNNRecSys workloads and a comprehensive analysis of their performance on a persistent memory device, namely, Intel Optane.
no code implementations • 4 Mar 2022 • Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang
In addition, since PreCropping compresses CNNs at initialization, the computational and memory costs of CNNs are reduced for both training and inference on commodity hardware.
1 code implementation • 10 Feb 2022 • Tao Yu, Yichi Zhang, Zhiru Zhang, Christopher De Sa
Using representation theory, we characterize which similarity matrices can be "expressed" by finite group VSA hypervectors, and we show how these VSAs can be constructed.
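For orientation only, here is the simplest bipolar (MAP-style) VSA rather than the paper's group-theoretic construction: binding is elementwise multiplication, which is its own inverse, and similarity is a normalized dot product:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10_000
a, b = rng.choice([-1, 1], size=(2, d))   # random bipolar hypervectors

bound = a * b                   # binding: elementwise multiply (self-inverse)
recovered = bound * b           # unbinding with the same vector recovers a
sim = lambda u, v: u @ v / d    # normalized similarity in [-1, 1]

print(sim(recovered, a))        # 1.0: exact recovery
print(sim(bound, a))            # ~0.0: the bound vector resembles neither factor
```

Which similarity structures a finite group VSA can realize is exactly the question the paper answers with representation theory.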
1 code implementation • 30 Jan 2022 • Chenhui Deng, Xiuyu Li, Zhuo Feng, Zhiru Zhang
Graph neural networks (GNNs) have been increasingly deployed in various applications that involve learning on non-Euclidean data.
1 code implementation • CVPR 2022 • Yichi Zhang, Zhiru Zhang, Lukasz Lew
In order to enable joint optimization of the cost together with accuracy, we define arithmetic computation effort (ACE), a hardware- and energy-inspired cost metric for quantized and binarized networks.
Ranked #1 on Binarization on ImageNet
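Assuming ACE charges each multiply-accumulate by the product of its operand bitwidths (the paper's exact accounting may differ), the metric reduces to a one-liner:

```python
def ace(num_macs: int, act_bits: int, weight_bits: int) -> int:
    # Each MAC costs roughly act_bits x weight_bits bit-level operations.
    return num_macs * act_bits * weight_bits

macs = 1_800_000_000                       # a ResNet-50-scale forward pass
print(ace(macs, 8, 8) / ace(macs, 1, 1))   # int8 costs ~64x a fully binary network
```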
no code implementations • NeurIPS 2021 • Weizhe Hua, Yichi Zhang, Chuan Guo, Zhiru Zhang, G. Edward Suh
Neural network robustness has become a central topic in machine learning in recent years.
no code implementations • 29 Sep 2021 • Chenhui Deng, Xiuyu Li, Zhuo Feng, Zhiru Zhang
In this paper, we propose GARNET, a scalable spectral method to boost the adversarial robustness of GNN models for both homophilic and heterophilic graphs.
no code implementations • 16 Sep 2021 • Mark Buckler, Neil Adit, Yuwei Hu, Zhiru Zhang, Adrian Sampson
Our key insights are that 1) pointwise convolutions commute with frequency transformation and thus can be computed in the frequency domain without modification, 2) each channel within a given layer has a different level of sensitivity to frequency domain pruning, and 3) each channel's sensitivity to frequency pruning is approximately monotonic with respect to frequency.
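Insight 1) is straightforward to verify numerically: a pointwise (1x1) convolution mixes channels linearly, so it commutes with any linear spatial transform such as the DCT. A sketch using scipy:

```python
import numpy as np
from scipy.fft import dctn

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))   # (C_in, H, W) feature map
w = rng.standard_normal((4, 8))        # (C_out, C_in) pointwise weights

# Pointwise conv in the spatial domain, then DCT over the spatial axes
lhs = dctn(np.einsum('oc,chw->ohw', w, x), axes=(1, 2), norm='ortho')
# DCT first, then the same pointwise conv applied in the frequency domain
rhs = np.einsum('oc,chw->ohw', w, dctn(x, axes=(1, 2), norm='ortho'))

assert np.allclose(lhs, rhs)   # 1x1 convolutions commute with the transform
```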
no code implementations • 25 Mar 2021 • Cong Hao, Jordan Dotzel, JinJun Xiong, Luca Benini, Zhiru Zhang, Deming Chen
Artificial intelligence (AI) technologies have dramatically advanced in recent years, resulting in revolutionary changes in people's lives.
1 code implementation • 7 Feb 2021 • Wuxinlin Cheng, Chenhui Deng, Zhiqiang Zhao, Yaohui Cai, Zhiru Zhang, Zhuo Feng
A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model.
2 code implementations • 22 Dec 2020 • Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang
We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations.
1 code implementation • 4 Dec 2020 • Shubham Rai, Walter Lau Neto, Yukio Miyasaka, Xinpei Zhang, Mingfei Yu, Qingyang Yi, Masahiro Fujita, Guilherme B. Manske, Matheus F. Pontes, Leomar S. da Rosa Junior, Marilton S. de Aguiar, Paulo F. Butzen, Po-Chun Chien, Yu-Shan Huang, Hoa-Ren Wang, Jie-Hong R. Jiang, Jiaqi Gu, Zheng Zhao, Zixuan Jiang, David Z. Pan, Brunno A. de Abreu, Isac de Souza Campos, Augusto Berndt, Cristina Meinhardt, Jonata T. Carvalho, Mateus Grellert, Sergio Bampi, Aditya Lohana, Akash Kumar, Wei Zeng, Azadeh Davoodi, Rasit O. Topaloglu, Yuan Zhou, Jordan Dotzel, Yichi Zhang, Hanyu Wang, Zhiru Zhang, Valerio Tenace, Pierre-Emmanuel Gaillardon, Alan Mishchenko, Satrajit Chatterjee
If the function is incompletely specified, the implementation only needs to be correct on the care set.
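A tiny worked example: with the input (a, b) = (1, 0) left as a don't-care, two different circuits both satisfy the specification, and the cheaper one may be chosen:

```python
def impl_or(a, b):      # candidate circuit 1: a OR b
    return a | b

def impl_pass_b(a, b):  # candidate circuit 2: just wire b through (cheaper)
    return b

spec = {(0, 0): 0, (0, 1): 1, (1, 1): 1}   # the care set; (1, 0) is a don't-care

for impl in (impl_or, impl_pass_b):
    assert all(impl(a, b) == out for (a, b), out in spec.items())
# Both are valid implementations; they differ only on the don't-care input (1, 0).
```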
no code implementations • 26 Aug 2020 • Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh
This paper proposes GuardNN, a secure DNN accelerator that provides hardware-based protection for user data and model parameters even in an untrusted environment.
no code implementations • 26 Aug 2020 • Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang
FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge.
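A hedged illustration of the template-plus-UDF idea follows; the interface below is hypothetical, written only to show the decomposition, and is not FeatGraph's actual API:

```python
import numpy as np

def gather_scatter(edges, feats, message_udf, reduce_udf):
    """Coarse-grained sparse template parameterized by fine-grained UDFs."""
    inbox = {}
    for src, dst in edges:
        inbox.setdefault(dst, []).append(message_udf(feats[src]))
    return {v: reduce_udf(np.stack(msgs)) for v, msgs in inbox.items()}

feats = np.eye(3, dtype=np.float32)                     # one-hot features, 3 nodes
agg = gather_scatter([(0, 2), (1, 2)], feats,
                     message_udf=lambda h: 2.0 * h,     # per-edge UDF
                     reduce_udf=lambda m: m.sum(axis=0))  # per-vertex UDF
print(agg[2])   # [2. 2. 0.]
```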
no code implementations • 20 Apr 2020 • Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh
This paper introduces MGX, a near-zero overhead memory protection scheme for hardware accelerators.
1 code implementation • ICLR 2020 • Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang
The proposed approach is applicable to a variety of DNN architectures and significantly reduces the computational cost of DNN execution with almost no accuracy loss.
no code implementations • 13 Oct 2019 • Ritchie Zhao, Jordan Dotzel, Zhanqiu Hu, Preslav Ivanov, Christopher De Sa, Zhiru Zhang
Specialized hardware for handling activation outliers can enable low-precision neural networks, but at the cost of nontrivial area overhead.
1 code implementation • ICLR 2020 • Chenhui Deng, Zhiqiang Zhao, Yongyu Wang, Zhiru Zhang, Zhuo Feng
GraphZoom first performs graph fusion to generate a new graph that effectively encodes the topology of the original graph and the node attribute information.
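One plausible reading of the fusion step, with the kNN construction and the blending weight `beta` as our assumptions rather than the paper's exact recipe:

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph

def fuse(adj, feats, k=2, beta=1.0):
    # Attribute edges: connect each node to its k nearest neighbors in feature space.
    knn = kneighbors_graph(feats, n_neighbors=k, mode='connectivity').toarray()
    knn = np.maximum(knn, knn.T)    # symmetrize
    return adj + beta * knn         # blend topology with attribute similarity
```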
no code implementations • 15 Apr 2019 • Cunxi Yu, Zhiru Zhang
The physical design process commonly consumes hours to days for large designs, and routing is known as the most critical step.
3 code implementations • 28 Jan 2019 • Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang
The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.
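One established technique in this retraining-free setting is outlier channel splitting: duplicate the channel holding an outlier weight and halve both copies, which preserves the layer's output exactly while shrinking the quantization range. A sketch of the core identity (in a full network the preceding layer's output channel is duplicated; here the input vector is duplicated by hand):

```python
import numpy as np

def split_outlier_channel(w, x):
    """Duplicate the input channel with the largest-magnitude weight; halve both copies."""
    j = np.unravel_index(np.argmax(np.abs(w)), w.shape)[1]  # column holding the outlier
    w_split = np.concatenate([w, w[:, [j]] / 2], axis=1)    # appended half-copy
    w_split[:, j] /= 2                                      # halve the original column
    x_split = np.concatenate([x, x[[j]]], axis=0)           # duplicate that input
    return w_split, x_split

rng = np.random.default_rng(0)
w, x = rng.standard_normal((4, 8)), rng.standard_normal(8)
w2, x2 = split_outlier_channel(w, x)
assert np.allclose(w @ x, w2 @ x2)            # layer output is unchanged
assert np.abs(w2).max() <= np.abs(w).max()    # the outlier's magnitude is halved
```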
no code implementations • CVPR 2019 • Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang
UGConvs generalize two disparate ideas in CNN architecture, channel shuffling (i.e., ShuffleNet) and block-circulant networks (i.e., CirCNN), and provide unifying insights that lead to a deeper understanding of each technique.
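Channel shuffling itself is a one-line reshape-transpose-reshape; a sketch for a single feature map:

```python
import numpy as np

def channel_shuffle(x, groups):
    # ShuffleNet-style shuffle: reshape channels to (g, c//g), transpose, flatten back.
    c, h, w = x.shape
    return x.reshape(groups, c // groups, h, w).swapaxes(0, 1).reshape(c, h, w)

x = np.arange(8).reshape(8, 1, 1)
print(channel_shuffle(x, groups=2).ravel())   # [0 4 1 5 2 6 3 7]
```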
1 code implementation • NeurIPS 2019 • Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, G. Edward Suh
Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet.
no code implementations • 15 Jul 2017 • Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta
State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution.