Search Results for author: Chiachen Chou

Found 3 papers, 0 papers with code

Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

no code implementations • 8 Jun 2023 • Clemens JS Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang

To address this challenge, we propose a mixed-precision post-training quantization (PTQ) approach that assigns different numerical precisions to tensors in a network based on their specific needs, reducing the memory footprint and improving latency while preserving model accuracy.

Quantization
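
The abstract above describes assigning per-tensor precisions according to each tensor's needs. As a rough illustration only (not the authors' actual algorithm), the sketch below picks the narrowest bit-width whose estimated quantization error fits a budget; the sensitivity values, tensor names, and `candidate_bits` are hypothetical placeholders.

```python
# Minimal sketch of mixed-precision assignment: give each tensor the narrowest
# bit-width whose estimated sensitivity stays under an error budget.
# All numbers below are illustrative placeholders, not values from the paper.

candidate_bits = [8, 6, 4]

# Hypothetical per-tensor error estimates at each bit-width
# (imagined as already augmented with inter-layer dependency terms).
sensitivity = {
    "conv1.weight": {8: 0.001, 6: 0.004, 4: 0.020},
    "conv2.weight": {8: 0.002, 6: 0.015, 4: 0.080},
    "fc.weight":    {8: 0.001, 6: 0.002, 4: 0.005},
}

def assign_precisions(sensitivity, per_tensor_budget=0.01):
    """Pick the smallest bit-width whose estimated error fits the budget."""
    config = {}
    for name, errors in sensitivity.items():
        chosen = max(candidate_bits)  # fall back to the widest candidate
        for bits in sorted(candidate_bits):  # try narrowest first
            if errors[bits] <= per_tensor_budget:
                chosen = bits
                break
        config[name] = chosen
    return config

print(assign_precisions(sensitivity))
# e.g. {'conv1.weight': 6, 'conv2.weight': 8, 'fc.weight': 4}
```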

Mixed Precision Post Training Quantization of Neural Networks with Sensitivity Guided Search

no code implementations • 2 Feb 2023 • Clemens JS Schaefer, Elfie Guo, Caitlin Stanton, Xiaofan Zhang, Tom Jablin, Navid Lambert-Shirzad, Jian Li, Chiachen Chou, Siddharth Joshi, Yu Emma Wang

In this paper, we propose a method to efficiently determine quantization configurations for the different tensors in ML models using post-training mixed-precision quantization.

Quantization
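
As a loose sketch of what a sensitivity-guided search might look like (an assumption for illustration, not the paper's method), one could probe each tensor in isolation, rank tensors by the resulting accuracy drop, and demote the least sensitive ones first; `evaluate` and the tensor sizes here are hypothetical.

```python
# Illustrative sensitivity-guided search: estimate a per-tensor sensitivity
# proxy (accuracy drop when only that tensor is quantized to low precision),
# then greedily demote the least sensitive tensors until a size budget is met.

def sensitivity_guided_search(tensors, evaluate, size_budget,
                              high_bits=8, low_bits=4):
    """tensors: {name: num_elements}; evaluate(config) -> accuracy (hypothetical)."""
    baseline = evaluate({name: high_bits for name in tensors})

    # 1. Probe each tensor in isolation to estimate its sensitivity.
    sensitivity = {}
    for name in tensors:
        probe = {n: high_bits for n in tensors}
        probe[name] = low_bits
        sensitivity[name] = baseline - evaluate(probe)  # accuracy drop

    # 2. Demote tensors from least to most sensitive until the size fits.
    config = {name: high_bits for name in tensors}
    for name in sorted(tensors, key=lambda n: sensitivity[n]):
        if sum(bits * tensors[n] for n, bits in config.items()) <= size_budget:
            break
        config[name] = low_bits
    return config
```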

Scale MLPerf-0.6 models on Google TPU-v3 Pods

no code implementations • 21 Sep 2019 • Sameer Kumar, Victor Bitorff, Dehao Chen, Chiachen Chou, Blake Hechtman, HyoukJoong Lee, Naveen Kumar, Peter Mattson, Shibo Wang, Tao Wang, Yuanzhong Xu, Zongwei Zhou

The recent submission of Google TPU-v3 Pods to the industry-wide MLPerf v0.6 training benchmark demonstrates the scalability of a suite of industry-relevant ML models.

Benchmarking
