Search Results for author: Chiachen Chou

Found 3 papers, 0 papers with code

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

no code implementations • 8 Jun 2023 • Clemens JS Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang

To address this challenge, we propose a mixed-precision post-training quantization (PTQ) approach that assigns different numerical precisions to tensors in a network based on their specific needs, reducing memory footprint and improving latency while preserving model accuracy.

Quantization

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search

no code implementations • 2 Feb 2023 • Clemens JS Schaefer, Elfie Guo, Caitlin Stanton, Xiaofan Zhang, Tom Jablin, Navid Lambert-Shirzad, Jian Li, Chiachen Chou, Siddharth Joshi, Yu Emma Wang

In this paper, we propose a method to efficiently determine the quantization configurations of different tensors in ML models using post-training mixed-precision quantization.

Quantization

Scale MLPerf-0.6 models on Google TPU-v3 Pods

no code implementations • 21 Sep 2019 • Sameer Kumar, Victor Bitorff, Dehao Chen, Chiachen Chou, Blake Hechtman, HyoukJoong Lee, Naveen Kumar, Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou

The recent submission of Google TPU-v3 Pods to the industry-wide MLPerf v0.6 training benchmark demonstrates the scalability of a suite of industry-relevant ML models.

Benchmarking
