Search Results for author: Zhiru Zhang

Found 35 papers, 15 papers with code

Binarized Convolutional Neural Networks with Separable Filters for Efficient Hardware Acceleration

no code implementations • 15 Jul 2017 • Jeng-Hau Lin, Tianwei Xing, Ritchie Zhao, Zhiru Zhang, Mani Srivastava, Zhuowen Tu, Rajesh K. Gupta

State-of-the-art convolutional neural networks are enormously costly in both compute and memory, demanding massively parallel GPUs for execution.

Channel Gating Neural Networks

1 code implementation • NeurIPS 2019 • Weizhe Hua, Yuan Zhou, Christopher De Sa, Zhiru Zhang, G. Edward Suh

Combining our method with knowledge distillation reduces the compute cost of ResNet-18 by 2.6$\times$ without accuracy drop on ImageNet.

Knowledge Distillation • Network Pruning

Building Efficient Deep Neural Networks with Unitary Group Convolutions

no code implementations • CVPR 2019 • Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

UGConvs generalize two disparate ideas in CNN architecture, channel shuffling (i.e., ShuffleNet) and block-circulant networks (i.e., CirCNN), and provide unifying insights that lead to a deeper understanding of each technique.

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

3 code implementations • 28 Jan 2019 • Ritchie Zhao, Yuwei Hu, Jordan Dotzel, Christopher De Sa, Zhiru Zhang

The majority of existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training.

Language Modelling • Neural Network Compression +1
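The core trick is compact enough to sketch. Below is a minimal numpy illustration of outlier channel splitting for a fully connected layer (the 1% expand ratio and top-k selection are illustrative choices, not the paper's exact recipe): duplicating an outlier channel and halving both copies preserves the layer's function exactly while shrinking the range the quantizer must cover.

```python
import numpy as np

def outlier_channel_split(W, expand_ratio=0.01):
    """Split the input channels of W holding the largest-magnitude weights.

    Each selected channel is halved in place and a halved duplicate is
    appended, so the layer's function is preserved once the matching
    activations are duplicated too.
    """
    n_split = max(1, int(W.shape[1] * expand_ratio))
    order = np.argsort(np.abs(W).max(axis=0))[::-1][:n_split]
    halves = W[:, order] / 2.0
    W_new = W.copy()
    W_new[:, order] = halves
    W_new = np.concatenate([W_new, halves], axis=1)
    return W_new, order            # `order`: which activations to duplicate

W = np.random.randn(64, 128); W[0, 5] = 8.0     # planted outlier weight
W2, dup = outlier_channel_split(W)
x = np.random.randn(128)
x2 = np.concatenate([x, x[dup]])                # duplicate matching inputs
assert np.allclose(W @ x, W2 @ x2)              # function preserved exactly
```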

Painting on Placement: Forecasting Routing Congestion using Conditional Generative Adversarial Nets

no code implementations • 15 Apr 2019 • Cunxi Yu, Zhiru Zhang

The physical design process commonly consumes hours to days for large designs, and routing is known to be the most critical step.

Colorization • Translation

GraphZoom: A multi-level spectral approach for accurate and scalable graph embedding

1 code implementation • ICLR 2020 • Chenhui Deng, Zhiqiang Zhao, Yongyu Wang, Zhiru Zhang, Zhuo Feng

GraphZoom first performs graph fusion to generate a new graph that effectively encodes the topology of the original graph and the node attribute information.

Attribute • Graph Embedding
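A simplified sketch of the fusion step, assuming dense matrices and cosine similarity over node features; GraphZoom itself operates on sparse matrices and follows fusion with coarsening, embedding, and refinement stages.

```python
import numpy as np

def fuse_graph(A, X, k=5, beta=1.0):
    """Encode topology (A) and node attributes (X) in one fused graph by
    adding a k-NN feature-similarity graph to the original adjacency."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    S = Xn @ Xn.T                               # cosine similarity
    np.fill_diagonal(S, -np.inf)                # exclude self-loops
    knn = np.zeros(A.shape)
    for i in range(A.shape[0]):
        nbrs = np.argpartition(S[i], -k)[-k:]   # k most similar nodes
        knn[i, nbrs] = np.maximum(S[i, nbrs], 0.0)
    knn = np.maximum(knn, knn.T)                # symmetrize
    return A + beta * knn                       # fused adjacency
```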

OverQ: Opportunistic Outlier Quantization for Neural Network Accelerators

no code implementations • 13 Oct 2019 • Ritchie Zhao, Jordan Dotzel, Zhanqiu Hu, Preslav Ivanov, Christopher De Sa, Zhiru Zhang

Specialized hardware for handling activation outliers can enable low-precision neural networks, but at the cost of nontrivial area overhead.

Quantization

Precision Gating: Improving Neural Network Efficiency with Dynamic Dual-Precision Activations

1 code implementation • ICLR 2020 • Yichi Zhang, Ritchie Zhao, Weizhe Hua, Nayun Xu, G. Edward Suh, Zhiru Zhang

The proposed approach is applicable to a variety of DNN architectures and significantly reduces the computational cost of DNN execution with almost no accuracy loss.

Quantization
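A rough numpy sketch of the dual-precision idea: compute every output cheaply at low precision, then spend high-precision compute only where the gate fires. The constant threshold below stands in for the learnable thresholds in the paper.

```python
import numpy as np

def quantize(v, frac_bits):
    """Round to a fixed-point grid with 2**-frac_bits resolution."""
    scale = 2.0 ** frac_bits
    return np.round(v * scale) / scale

def precision_gated_matmul(W, x, frac_bits_lo=2, threshold=0.5):
    y_lo = W @ quantize(x, frac_bits_lo)   # cheap pass over all outputs
    hot = np.abs(y_lo) > threshold         # gate: which outputs matter?
    y = y_lo.copy()
    y[hot] = W[hot] @ x                    # precise pass over a few rows
    return y
```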

MGX: Near-Zero Overhead Memory Protection for Data-Intensive Accelerators

no code implementations • 20 Apr 2020 • Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

This paper introduces MGX, a near-zero overhead memory protection scheme for hardware accelerators.

GuardNN: Secure Accelerator Architecture for Privacy-Preserving Deep Learning

no code implementations • 26 Aug 2020 • Weizhe Hua, Muhammad Umar, Zhiru Zhang, G. Edward Suh

This paper proposes GuardNN, a secure DNN accelerator that provides hardware-based protection for user data and model parameters even in an untrusted environment.

Privacy Preserving • Privacy Preserving Deep Learning

FeatGraph: A Flexible and Efficient Backend for Graph Neural Network Systems

no code implementations • 26 Aug 2020 • Yuwei Hu, Zihao Ye, Minjie Wang, Jiali Yu, Da Zheng, Mu Li, Zheng Zhang, Zhiru Zhang, Yida Wang

FeatGraph provides a flexible programming interface to express diverse GNN models by composing coarse-grained sparse templates with fine-grained user-defined functions (UDFs) on each vertex/edge.
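The template-plus-UDF decomposition fits in a few lines of Python. The names below are illustrative, not FeatGraph's actual API; the real system generates optimized kernels rather than interpreting UDFs.

```python
import numpy as np

def spmm_template(indptr, indices, edge_udf, reduce_udf, n_dst):
    """Coarse-grained sparse template: a CSR traversal whose per-edge and
    per-vertex behavior is supplied by fine-grained UDFs."""
    out = []
    for dst in range(n_dst):
        srcs = indices[indptr[dst]:indptr[dst + 1]]
        msgs = [edge_udf(dst, src) for src in srcs]
        out.append(reduce_udf(msgs))
    return np.stack(out)

# Example UDFs: mean aggregation of neighbor features (a GCN-style kernel).
feat = np.random.randn(4, 8)
indptr = np.array([0, 2, 3, 3, 4])     # 4-node graph in CSR form
indices = np.array([1, 2, 3, 0])
out = spmm_template(indptr, indices, edge_udf=lambda dst, src: feat[src],
                    reduce_udf=lambda m: np.mean(m, axis=0) if m else np.zeros(8))
```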

SPADE: A Spectral Method for Black-Box Adversarial Robustness Evaluation

2 code implementations • 7 Feb 2021 • Wuxinlin Cheng, Chenhui Deng, Zhiqiang Zhao, Yaohui Cai, Zhiru Zhang, Zhuo Feng

A black-box spectral method is introduced for evaluating the adversarial robustness of a given machine learning (ML) model.

Adversarial Robustness • Graph Embedding

Enabling Design Methodologies and Future Trends for Edge AI: Specialization and Co-design

no code implementations • 25 Mar 2021 • Cong Hao, Jordan Dotzel, JinJun Xiong, Luca Benini, Zhiru Zhang, Deming Chen

Artificial intelligence (AI) technologies have dramatically advanced in recent years, resulting in revolutionary changes in people's lives.

Benchmarking • Edge-computing

Dense Pruning of Pointwise Convolutions in the Frequency Domain

no code implementations • 16 Sep 2021 • Mark Buckler, Neil Adit, Yuwei Hu, Zhiru Zhang, Adrian Sampson

Our key insights are that 1) pointwise convolutions commute with frequency transformation and thus can be computed in the frequency domain without modification, 2) each channel within a given layer has a different level of sensitivity to frequency domain pruning, and 3) each channel's sensitivity to frequency pruning is approximately monotonic with respect to frequency.
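Insight 1 is easy to verify numerically: a pointwise convolution only mixes channels, while a spatial DCT acts independently on each channel, so the two linear operations commute.

```python
import numpy as np
from scipy.fft import dctn

C_in, C_out, H, W_sp = 8, 16, 5, 5
x = np.random.randn(C_in, H, W_sp)
W = np.random.randn(C_out, C_in)                       # 1x1 conv = channel mix

conv1x1 = lambda w, t: np.einsum('oc,chw->ohw', w, t)
to_freq = lambda t: dctn(t, axes=(1, 2), norm='ortho')  # per-channel DCT

a = to_freq(conv1x1(W, x))   # convolve, then transform
b = conv1x1(W, to_freq(x))   # transform, then convolve
assert np.allclose(a, b)     # identical, so pruning can happen in frequency space
```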

GARNET: A Spectral Approach to Robust and Scalable Graph Neural Networks

no code implementations • 29 Sep 2021 • Chenhui Deng, Xiuyu Li, Zhuo Feng, Zhiru Zhang

In this paper, we propose GARNET, a scalable spectral method to boost the adversarial robustness of GNN models for both homophilic and heterophilic graphs.

Adversarial Robustness • Graph Embedding
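A minimal dense sketch of the reduced-rank idea behind GARNET (see also the arXiv version below): rebuilding the graph from its dominant spectral components filters out much of an adversarial edge perturbation. The actual method avoids the full eigendecomposition used here for scalability.

```python
import numpy as np

def low_rank_adjacency(A, r=16):
    """Keep only the r spectrally dominant components of a symmetric
    adjacency matrix; small perturbing modes are discarded."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[-r:]        # r largest-|eigenvalue| modes
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T
```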

PokeBNN: A Binary Pursuit of Lightweight Accuracy

1 code implementation • CVPR 2022 • Yichi Zhang, Zhiru Zhang, Lukasz Lew

In order to enable joint optimization of the cost together with accuracy, we define arithmetic computation effort (ACE), a hardware- and energy-inspired cost metric for quantized and binarized networks.

Binarization
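The metric itself is one line: each multiply-accumulate is charged the product of its operands' bit-widths, so a binary MAC costs 1 and an int8 MAC costs 64. A sketch assuming uniform per-layer bit-widths:

```python
def ace(macs_per_layer, w_bits, a_bits):
    """Arithmetic computation effort: sum over layers of
    (#MACs) * (weight bit-width) * (activation bit-width)."""
    return sum(n * bw * ba
               for n, bw, ba in zip(macs_per_layer, w_bits, a_bits))

# A 3-layer example with a binarized middle layer:
print(ace([1e6, 5e6, 1e6], w_bits=[8, 1, 8], a_bits=[8, 1, 8]))  # 133000000.0
```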

GARNET: Reduced-Rank Topology Learning for Robust and Scalable Graph Neural Networks

1 code implementation • 30 Jan 2022 • Chenhui Deng, Xiuyu Li, Zhuo Feng, Zhiru Zhang

Graph neural networks (GNNs) have been increasingly deployed in various applications that involve learning on non-Euclidean data.

Adversarial Robustness

Understanding Hyperdimensional Computing for Parallel Single-Pass Learning

1 code implementation • 10 Feb 2022 • Tao Yu, Yichi Zhang, Zhiru Zhang, Christopher De Sa

Using representation theory, we characterize which similarity matrices can be "expressed" by finite group VSA hypervectors, and we show how these VSAs can be constructed.
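The simplest finite group VSA uses ±1 hypervectors under elementwise multiplication (the group Z_2 in each coordinate); binding is then its own inverse, so bundled role-filler pairs stay individually retrievable. A self-contained example:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 4096                                       # hypervector dimensionality

def rand_hv():   return rng.choice([-1, 1], size=D)
def bind(a, b):  return a * b                  # Z_2 binding, self-inverse
def bundle(*hv): return np.sign(np.sum(hv, axis=0))
def sim(a, b):   return (a @ b) / D            # normalized similarity

role, filler = rand_hv(), rand_hv()
memory = bundle(bind(role, filler), rand_hv(), rand_hv())
print(sim(bind(role, memory), filler))         # ~0.5: filler recovered
print(sim(rand_hv(), filler))                  # ~0.0: chance baseline
```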

Structured Pruning is All You Need for Pruning CNNs at Initialization

no code implementations • 4 Mar 2022 • Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang

In addition, since PreCropping compresses CNNs at initialization, the computational and memory costs of CNNs are reduced for both training and inference on commodity hardware.

Model Compression

Analysis and Optimization of GNN-Based Recommender Systems on Persistent Memory

no code implementations • 25 Jul 2022 • Yuwei Hu, Jiajie Li, Zhongming Yu, Zhiru Zhang

To understand whether persistent memory is a good fit for GNNRecSys training, we perform an in-depth characterization of GNNRecSys workloads and a comprehensive analysis of their performance on a persistent memory device, namely, Intel Optane.

Link Prediction • Recommendation Systems

Binarized Neural Machine Translation

1 code implementation • NeurIPS 2023 • Yichi Zhang, Ankush Garg, Yuan Cao, Łukasz Lew, Behrooz Ghorbani, Zhiru Zhang, Orhan Firat

In this work, we propose a novel binarization technique for Transformers applied to machine translation (BMT), the first of its kind.

Binarization • Machine Translation +2
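BMT builds on standard 1-bit weight binarization; a generic sketch (sign of the weights plus a per-output-channel scale, not the paper's exact scheme) is:

```python
import numpy as np

def binarize_weights(W):
    """1-bit weights: sign(W) scaled by the mean |w| of each output row,
    the classic BNN-style least-squares approximation."""
    alpha = np.abs(W).mean(axis=1, keepdims=True)
    return alpha * np.sign(W)

W = np.random.randn(512, 512)
x = np.random.randn(512)
y, y_bin = W @ x, binarize_weights(W) @ x
print(np.linalg.norm(y - y_bin) / np.linalg.norm(y))  # relative error
```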

Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training

no code implementations • 16 Feb 2023 • Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang

Specifically, Slapo works on a PyTorch model and uses a set of schedule primitives to convert the model for common model training optimizations such as high-performance kernels, effective 3D parallelism, and efficient activation checkpointing.

Scheduling

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search

no code implementations • 7 Aug 2023 • Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li

With the proposed integer quantization search, we increase the accuracy of ResNet-18 on ImageNet by 1.31% points and ResNet-50 by 0.90% points with equivalent model cost over previous methods.

Quantization

Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference

no code implementations • 23 Dec 2023 • Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang

Experimental results demonstrate our approach can achieve up to 13.4x speedup when compared to previous FPGA-based accelerators for the BERT model.

Language Modelling • Large Language Model

Trainable Fixed-Point Quantization for Deep Learning Acceleration on FPGAs

no code implementations • 31 Jan 2024 • Dingyi Dai, Yichi Zhang, Jiahao Zhang, Zhanqiu Hu, Yaohui Cai, Qi Sun, Zhiru Zhang

Quantization is a crucial technique for deploying deep learning models on resource-constrained devices, such as embedded FPGAs.

Quantization

SAGMAN: Stability Analysis of Graph Neural Networks on the Manifolds

no code implementations • 13 Feb 2024 • Wuxinlin Cheng, Chenhui Deng, Ali Aghdaei, Zhiru Zhang, Zhuo Feng

Modern graph neural networks (GNNs) can be sensitive to changes in the input graph structure and node features, potentially resulting in unpredictable behavior and degraded performance.

Dimensionality Reduction • Graph Embedding +1

Exploring the Limits of Semantic Image Compression at Micro-bits per Pixel

no code implementations • 21 Feb 2024 • Jordan Dotzel, Bahaa Kotb, James Dotzel, Mohamed Abdelfattah, Zhiru Zhang

Traditional methods, such as JPEG, perform image compression by operating on structural information, such as pixel values or frequency content.

Image Compression
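For contrast with the semantic approach, a toy version of the "structural" pipeline the abstract refers to: transform an 8x8 block to frequency space, quantize coarsely, and observe that most coefficients vanish, which is where JPEG-style compression comes from.

```python
import numpy as np
from scipy.fft import dctn, idctn

def jpeg_style_block(block, q=20.0):
    coeffs = dctn(block, norm='ortho')        # pixels -> frequency content
    quant = np.round(coeffs / q) * q          # coarse uniform quantization
    kept = np.count_nonzero(quant)            # survivors get entropy-coded
    return idctn(quant, norm='ortho'), kept

block = np.outer(np.arange(8.0), np.ones(8)) * 16.0   # smooth gradient block
recon, kept = jpeg_style_block(block)
print(f"{kept} of 64 coefficients survive quantization")
```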

Polynormer: Polynomial-Expressive Graph Transformer in Linear Time

2 code implementations • 2 Mar 2024 • Chenhui Deng, Zichao Yue, Zhiru Zhang

To make the base model permutation-equivariant, we integrate it with graph topology and node features separately, resulting in local and global equivariant attention models.

Node Classification

Less is More: Hop-Wise Graph Attention for Scalable and Generalizable Learning on Circuits

1 code implementation • 2 Mar 2024 • Chenhui Deng, Zichao Yue, Cunxi Yu, Gokce Sarar, Ryan Carey, Rajeev Jain, Zhiru Zhang

In this work, we propose HOGA, a novel attention-based model for learning circuit representations in a scalable and generalizable manner.

Graph Attention
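A simplified sketch of the hop-wise idea, with a fixed query vector standing in for HOGA's learned attention: hop features are precomputed once, so training never has to touch the graph, which is what makes the approach scalable.

```python
import numpy as np

def hopwise_features(A, X, num_hops=3):
    """Precompute [X, AX, A^2 X, ...] per node: shape (n, num_hops+1, d)."""
    feats, cur = [X], X
    for _ in range(num_hops):
        cur = A @ cur
        feats.append(cur)
    return np.stack(feats, axis=1)

def hop_attention(H, q):
    """Softmax-attend over the hop axis; HOGA learns these weights."""
    scores = H @ q                                      # (n, num_hops+1)
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return np.einsum('nh,nhd->nd', w, H)                # per-node embedding
```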

UniSparse: An Intermediate Language for General Sparse Format Customization

1 code implementation • 9 Mar 2024 • Jie Liu, Zhongyuan Zhao, Zijian Ding, Benjamin Brock, Hongbo Rong, Zhiru Zhang

The ongoing trend of hardware specialization has led to a growing use of custom data formats when processing sparse workloads, which are typically memory-bound.

Attribute • Code Generation
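As a concrete instance of what a format-customization language must be able to describe, here is CSR built and consumed in plain Python; UniSparse expresses such layouts (and compositions of them) declaratively instead of via hand-written conversion code.

```python
import numpy as np

def dense_to_csr(M):
    """CSR: compress the row dimension, store columns as coordinates."""
    indptr, indices, data = [0], [], []
    for row in M:
        nz = np.nonzero(row)[0]
        indices.extend(nz)
        data.extend(row[nz])
        indptr.append(len(indices))
    return np.array(indptr), np.array(indices), np.array(data)

def csr_spmv(indptr, indices, data, x):
    """Memory-bound SpMV: every stored nonzero is touched exactly once."""
    y = np.zeros(len(indptr) - 1)
    for i in range(len(y)):
        for j in range(indptr[i], indptr[i + 1]):
            y[i] += data[j] * x[indices[j]]
    return y
```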

Allo: A Programming Model for Composable Accelerator Design

2 code implementations • 7 Apr 2024 • Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang

For the GPT2 model, the inference latency of the Allo-generated accelerator is 1.7x faster than the NVIDIA A100 GPU with 5.4x higher energy efficiency, demonstrating the capability of Allo to handle large-scale designs.

Radial Networks: Dynamic Layer Routing for High-Performance Large Language Models

no code implementations • 7 Apr 2024 • Jordan Dotzel, Yash Akhauri, Ahmed S. AbouElhamayed, Carly Jiang, Mohamed Abdelfattah, Zhiru Zhang

In this work, we explore the practicality of layer sparsity by profiling residual connections and establish the relationship between model depth and layer sparsity.
