Search Results for author: Hongzheng Chen

Found 6 papers, 2 papers with code

Allo: A Programming Model for Composable Accelerator Design

2 code implementations • 7 Apr 2024 • Hongzheng Chen, Niansong Zhang, Shaojie Xiang, Zhichen Zeng, Mengjia Dai, Zhiru Zhang

For the GPT2 model, the inference latency of the Allo generated accelerator is 1. 7x faster than the NVIDIA A100 GPU with 5. 4x higher energy efficiency, demonstrating the capability of Allo to handle large-scale designs.

Paper
Code

Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference

no code implementations • 23 Dec 2023 • Hongzheng Chen, Jiahao Zhang, Yixiao Du, Shaojie Xiang, Zichao Yue, Niansong Zhang, Yaohui Cai, Zhiru Zhang

Experimental results demonstrate our approach can achieve up to 13. 4x speedup when compared to previous FPGA-based accelerators for the BERT model.

Language Modelling Large Language Model

Paper
Add Code

Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training

no code implementations • 16 Feb 2023 • Hongzheng Chen, Cody Hao Yu, Shuai Zheng, Zhen Zhang, Zhiru Zhang, Yida Wang

Specifically, Slapo works on a PyTorch model and uses a set of schedule primitives to convert the model for common model training optimizations such as high-performance kernels, effective 3D parallelism, and efficient activation checkpointing.

Scheduling

Paper
Add Code

Structured Pruning is All You Need for Pruning CNNs at Initialization

no code implementations • 4 Mar 2022 • Yaohui Cai, Weizhe Hua, Hongzheng Chen, G. Edward Suh, Christopher De Sa, Zhiru Zhang

In addition, since PreCropping compresses CNNs at initialization, the computational and memory costs of CNNs are reduced for both training and inference on commodity hardware.

Model Compression

Paper
Add Code

BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing

no code implementations • 16 Dec 2021 • Tianfeng Liu, Yangrui Chen, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo

Extensive experiments on various GNN models and large graph datasets show that BGL significantly outperforms existing GNN training systems by 20. 68x on average.

Graph Property Prediction Node Classification +1

Paper
Add Code

FracBNN: Accurate and FPGA-Efficient Binary Neural Networks with Fractional Activations

2 code implementations • 22 Dec 2020 • Yichi Zhang, Junhao Pan, Xinheng Liu, Hongzheng Chen, Deming Chen, Zhiru Zhang

We design an efficient FPGA-based accelerator for our novel BNN model that supports the fractional activations.

Binarization Image Classification

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.