Search Results for author: Size Zheng

Found 2 papers, 2 papers with code

Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

1 code implementation • 29 Oct 2023 • Yilong Zhao, Chien-Yu Lin, Kan Zhu, Zihao Ye, Lequn Chen, Size Zheng, Luis Ceze, Arvind Krishnamurthy, Tianqi Chen, Baris Kasikci

To maximize LLMs' serving throughput, we introduce Atom, a low-bit quantization method that achieves high throughput improvements with negligible accuracy loss.

Quantization, Sentiment Analysis

HASCO: Towards Agile HArdware and Software CO-design for Tensor Computation

1 code implementation • 4 May 2021 • Qingcheng Xiao, Size Zheng, Bingzhe Wu, Pengcheng Xu, Xuehai Qian, Yun Liang

Second, the overall design space composed of HW/SW partitioning, hardware optimization, and software optimization is huge.

Bayesian Optimization, Q-Learning
