no code implementations • 19 Sep 2022 • Rong Tian, Zijing Zhao, Weijie Liu, Haoyan Liu, Weiquan Mao, Zhe Zhao, Kan Zhou
The latest industrial inference engines, such as FasterTransformer and TurboTransformers, have demonstrated that half-precision floating point (FP16) and 8-bit integer (INT8) quantization can greatly improve model inference speed.
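To make the INT8 claim concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization, the general kind of scheme such engines apply to weights; the function names are illustrative and this is not the paper's or either engine's actual implementation:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Map float32 values to int8 using a single per-tensor scale (hypothetical helper)."""
    scale = np.abs(x).max() / 127.0          # largest magnitude maps to 127
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize_int8(q, s)
# int8 storage is 4x smaller than float32; rounding error is bounded by scale/2.
print(q.nbytes, w.nbytes)  # 4096 16384
```

Smaller weights mean less memory traffic, which is one reason INT8 inference can run faster than FP32 on hardware with integer tensor cores.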
no code implementations • 18 May 2014 • Tao Ye, Kan Zhou, Zhipeng Lu, Jin-Kao Hao
This paper introduces an effective memetic algorithm for the linear ordering problem with cumulative costs.
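A memetic algorithm hybridizes a genetic loop with local search. The sketch below shows that template on the linear ordering problem with cumulative costs, using the standard recursive cumulative-cost definition, order crossover, and a pairwise-swap local search; all function names and parameters are illustrative assumptions, not the authors' algorithm:

```python
import random

def lopcc_cost(perm, c, d):
    """Cumulative cost: alpha_i = d_i + sum over later items j of c[i][j] * alpha_j."""
    n = len(perm)
    alpha = [0.0] * n
    for i in range(n - 1, -1, -1):
        a = d[perm[i]]
        for j in range(i + 1, n):
            a += c[perm[i]][perm[j]] * alpha[j]
        alpha[i] = a
    return sum(alpha)

def local_search(perm, c, d):
    """First-improvement descent over all pairwise swaps; returns (perm, cost)."""
    best = lopcc_cost(perm, c, d)
    improved = True
    while improved:
        improved = False
        for i in range(len(perm) - 1):
            for j in range(i + 1, len(perm)):
                perm[i], perm[j] = perm[j], perm[i]
                cost = lopcc_cost(perm, c, d)
                if cost < best - 1e-12:
                    best, improved = cost, True
                else:
                    perm[i], perm[j] = perm[j], perm[i]  # undo non-improving swap
    return perm, best

def ox_crossover(p1, p2, rng):
    """Order crossover: copy a slice from p1, fill the rest in p2's order."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    rest = [x for x in p2 if x not in child[a:b + 1]]
    k = 0
    for i in range(n):
        if child[i] is None:
            child[i] = rest[k]
            k += 1
    return child

def memetic(c, d, pop_size=8, gens=15, seed=0):
    """Genetic loop + local search: every individual is locally optimized."""
    rng = random.Random(seed)
    n = len(d)
    pop = []
    for _ in range(pop_size):
        p = list(range(n))
        rng.shuffle(p)
        pop.append(local_search(p, c, d))
    for _ in range(gens):
        p1, p2 = rng.sample(pop, 2)
        child = local_search(ox_crossover(p1[0], p2[0], rng), c, d)
        pop.sort(key=lambda x: x[1])
        if child[1] < pop[-1][1]:
            pop[-1] = child  # replace the worst individual
    return min(pop, key=lambda x: x[1])

rng = random.Random(1)
n = 8
c = [[rng.random() for _ in range(n)] for _ in range(n)]
d = [rng.random() for _ in range(n)]
best_perm, best_cost = memetic(c, d)
print(best_perm, best_cost)
```

Applying local search to every offspring is what distinguishes the memetic template from a plain genetic algorithm: the population contains only swap-local optima.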