no code implementations • 19 Mar 2024 • Haocheng Xi, Yuxiang Chen, Kang Zhao, Kaijun Zheng, Jianfei Chen, Jun Zhu
Moreover, for a standard transformer block, our method offers an end-to-end training speedup of 1.42x and a 1.49x memory reduction compared to the FP16 baseline.
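The memory reduction against an FP16 baseline suggests lower-precision (INT8) storage of tensors. As a rough illustration only, and not the authors' implementation, here is a minimal sketch of symmetric per-tensor INT8 quantization in NumPy; all function names are hypothetical:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map floats to [-127, 127]."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from INT8 codes and their scale."""
    return q.astype(np.float32) * scale

x = np.random.randn(4, 8).astype(np.float32)
q, s = quantize_int8(x)
x_hat = dequantize_int8(q, s)

# INT8 uses 1 byte/element vs. 2 bytes for FP16, so quantized tensors shrink
# by up to 2x; scales and any tensors kept in higher precision lower the
# end-to-end saving, which is consistent with a ~1.49x overall figure.
print("max reconstruction error:", np.abs(x - x_hat).max())
```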