Search Results for author: Jiannan Tian

Found 6 papers, 3 papers with code

DeepSZ: A Novel Framework to Compress Deep Neural Networks by Using Error-Bounded Lossy Compression

1 code implementation • 26 Jan 2019 • Sian Jin, Sheng Di, Xin Liang, Jiannan Tian, Dingwen Tao, Franck Cappello

In this paper, we propose DeepSZ: an accuracy-loss bounded neural network compression framework, which involves four key steps: network pruning, error bound assessment, optimization for error bound configuration, and compressed model generation, featuring a high compression ratio and low encoding time.

Network Pruning • Neural Network Compression
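
The four steps named in the DeepSZ abstract can be sketched in a few lines; the function names and the toy quantizer below are hypothetical placeholders for illustration, not the actual DeepSZ API or implementation.

```python
# Minimal sketch of the four DeepSZ steps named above. All function
# names and the toy quantizer are hypothetical placeholders, not the
# actual DeepSZ algorithm.
import numpy as np

def prune(weights, threshold=1e-2):
    # Step 1: network pruning -- drop small-magnitude weights.
    return {n: np.where(np.abs(w) > threshold, w, 0.0) for n, w in weights.items()}

def assess_error_bounds(weights, candidate_bounds):
    # Step 2: error bound assessment -- estimate per-layer accuracy
    # impact for each candidate bound (placeholder: impact == bound).
    return {n: {eb: eb for eb in candidate_bounds} for n in weights}

def choose_bounds(impact, accuracy_budget):
    # Step 3: optimization for error bound configuration -- pick the
    # largest per-layer bound whose estimated impact stays in budget.
    per_layer = accuracy_budget / max(len(impact), 1)
    return {n: max((eb for eb, loss in imp.items() if loss <= per_layer),
                   default=min(imp)) for n, imp in impact.items()}

def compress_model(weights, bounds):
    # Step 4: compressed model generation -- simulated here by uniform
    # quantization to the chosen per-layer error bound.
    return {n: np.round(w / (2 * bounds[n])) * (2 * bounds[n])
            for n, w in weights.items()}

layers = {"fc1": np.random.randn(256, 128), "fc2": np.random.randn(128, 10)}
pruned = prune(layers)
bounds = choose_bounds(assess_error_bounds(pruned, [1e-4, 1e-3, 1e-2]), 0.01)
compact = compress_model(pruned, bounds)
```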

cuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression Framework for Scientific Data

2 code implementations • 19 Jul 2020 • Jiannan Tian, Sheng Di, Kai Zhao, Cody Rivera, Megan Hickman Fulp, Robert Underwood, Sian Jin, Xin Liang, Jon Calhoun, Dingwen Tao, Franck Cappello

To the best of our knowledge, cuSZ is the first error-bounded lossy compressor on GPUs for scientific data.

Distributed, Parallel, and Cluster Computing
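
For context, "error-bounded" means every reconstructed value stays within a user-set absolute bound of the original. The toy quantizer below only illustrates that property; it is not the cuSZ algorithm itself.

```python
# Toy illustration of the error-bounded guarantee (not the cuSZ
# algorithm): uniform scalar quantization moves each value by at most
# half a bin, i.e. by at most the requested absolute error bound.
import numpy as np

def lossy_roundtrip(data, error_bound):
    bin_width = 2.0 * error_bound
    codes = np.round(data / bin_width)      # stand-in for the compressed stream
    return codes * bin_width                # decompressed values

field = np.random.default_rng(0).standard_normal(1_000_000)
eb = 1e-3
recon = lossy_roundtrip(field, eb)
assert np.abs(field - recon).max() <= eb + 1e-12   # pointwise bound holds
```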

ClickTrain: Efficient and Accurate End-to-End Deep Learning Training via Fine-Grained Architecture-Preserving Pruning

no code implementations • 20 Nov 2020 • Chengming Zhang, Geng Yuan, Wei Niu, Jiannan Tian, Sian Jin, Donglin Zhuang, Zhe Jiang, Yanzhi Wang, Bin Ren, Shuaiwen Leon Song, Dingwen Tao

Moreover, compared with the state-of-the-art pruning-during-training approach, ClickTrain provides significant improvements in both accuracy and compression ratio on the tested CNN models and datasets, under similarly limited training time.

H-GCN: A Graph Convolutional Network Accelerator on Versal ACAP Architecture

no code implementations • 28 Jun 2022 • Chengming Zhang, Tong Geng, Anqi Guo, Jiannan Tian, Martin Herbordt, Ang Li, Dingwen Tao

Graph Neural Networks (GNNs) have drawn tremendous attention due to their unique capability to extend Machine Learning (ML) approaches to applications with unstructured data, especially graphs.

BIG-bench Machine Learning

SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

no code implementations • 1 Nov 2022 • Baixi Sun, Xiaodong Yu, Chengming Zhang, Jiannan Tian, Sian Jin, Kamil Iskra, Tao Zhou, Tekin Bicer, Pete Beckman, Dingwen Tao

Our evaluation with three scientific surrogates and 32 GPUs illustrates that SOLAR can achieve up to 24.4X speedup over PyTorch Data Loader and 3.52X speedup over state-of-the-art data loaders.

Benchmarking
