Search Results for author: Jicheng Wen

Found 1 paper, 1 paper with code

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

1 code implementation · 25 Sep 2024 · Yifei Liu, Jicheng Wen, Yang Wang, Shengyu Ye, Li Lyna Zhang, Ting Cao, Cheng Li, Mao Yang

Due to the redundancy in LLM weights, recent research has focused on pushing weight-only quantization to extremely low bit-widths (even down to 2 bits).
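To make the idea of extreme low-bit weight-only quantization concrete, here is a minimal generic vector-quantization sketch, not the VPTQ algorithm itself: weights are grouped into short vectors, a small codebook is fit with a few plain k-means steps, and each vector is stored as a single codebook index (an 8-bit index per 4-weight vector works out to 2 bits per weight, ignoring codebook overhead). All function and parameter names here are illustrative assumptions.

```python
# Generic weight vector-quantization sketch (NOT the VPTQ algorithm):
# group weights into short vectors, fit a small codebook via k-means,
# and keep only one codebook index per vector.
import numpy as np

def vector_quantize(W, vec_len=4, codebook_size=256, iters=10, seed=0):
    rng = np.random.default_rng(seed)
    vecs = W.reshape(-1, vec_len)                 # group weights into vectors
    # initialize the codebook from randomly chosen weight vectors
    codebook = vecs[rng.choice(len(vecs), codebook_size, replace=False)]
    for _ in range(iters):                        # plain k-means refinement
        d = ((vecs[:, None, :] - codebook[None]) ** 2).sum(-1)
        idx = d.argmin(1)
        for k in range(codebook_size):
            members = vecs[idx == k]
            if len(members):
                codebook[k] = members.mean(0)
    d = ((vecs[:, None, :] - codebook[None]) ** 2).sum(-1)
    idx = d.argmin(1)
    # 8-bit index per 4-weight vector => 2 bits per weight (plus codebook)
    return codebook, idx.astype(np.uint8)

def dequantize(codebook, idx, shape):
    # reconstruct an approximate weight matrix from indices
    return codebook[idx].reshape(shape)
```

Usage: quantize a weight matrix, then reconstruct it for inference; the reconstruction error depends on the codebook size and vector length.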

Quantization
