Search Results for author: Wanzin Yazar

Found 4 papers, 0 papers with code

Understanding the Difficulty of Low-Precision Post-Training Quantization for LLMs

no code implementations • 18 Oct 2024 • Zifei Xu, Sayeh Sharify, Wanzin Yazar, Tristan Webb, Xin Wang

Large language models with high parameter counts are computationally expensive, yet they can be made much more efficient by compressing their weights to very low numerical precision.

Quantization
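
The abstract above describes compressing LLM weights to very low numerical precision. As a point of reference only, the sketch below shows a generic round-to-nearest, symmetric per-channel weight quantizer; the bit-width, scaling scheme, and helper names are illustrative assumptions, not the method studied in the paper.

```python
# Minimal sketch of round-to-nearest low-precision weight quantization.
# Illustrative only; not the specific method examined in the paper above.
import numpy as np

def quantize_weights(w: np.ndarray, num_bits: int = 4):
    """Symmetric per-channel round-to-nearest quantization (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1                        # e.g. 7 for 4-bit signed
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax   # one scale per output channel
    scale = np.where(scale == 0, 1.0, scale)               # avoid division by zero
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)      # integer codes
    return q.astype(np.int8), scale

def dequantize(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights from integer codes and per-channel scales."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure reconstruction error.
w = np.random.randn(8, 16).astype(np.float32)
q, s = quantize_weights(w, num_bits=4)
print("mean abs error:", np.abs(w - dequantize(q, s)).mean())
```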

Scaling Laws for Post Training Quantized Large Language Models

no code implementations • 15 Oct 2024 • Zifei Xu, Alexander Lan, Wanzin Yazar, Tristan Webb, Sayeh Sharify, Xin Wang

Generalization abilities of well-trained large language models (LLMs) are known to scale predictably as a function of model size.

Quantization
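
The abstract above refers to generalization scaling predictably with model size. One commonly used functional form for such scaling laws is a power law in the parameter count plus an irreducible term; the expression below is illustrative and is not claimed to be the form fitted in this paper.

```latex
% Illustrative power-law scaling form (an assumption, not the paper's fitted law):
% N is the parameter count; a, \alpha, and c are fitted constants.
\[
  L(N) = a\,N^{-\alpha} + c
\]
```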

Post Training Quantization of Large Language Models with Microscaling Formats

no code implementations • 12 May 2024 • Sayeh Sharify, Utkarsh Saxena, Zifei Xu, Wanzin Yazar, Ilya Soloveychik, Xin Wang

Large Language Models (LLMs) have distinguished themselves with outstanding performance in complex language modeling tasks, yet they come with significant computational and storage challenges.

Language Modeling • Language Modelling • +1
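
Microscaling formats quantize values in small blocks that each share a single scale factor, with every element stored in only a few bits. The sketch below illustrates that block-wise idea in generic NumPy; the block size of 32, 4-bit signed elements, and power-of-two shared scales are assumptions for illustration, not the exact formats evaluated in the paper.

```python
# Minimal sketch of block-wise, microscaling-style quantization: each block of
# elements shares one scale, and each element is stored in a few bits.
# Block size, element bit-width, and power-of-two scales are illustrative assumptions.
import numpy as np

def mx_quantize(x: np.ndarray, block_size: int = 32, num_bits: int = 4):
    blocks = x.reshape(-1, block_size)                 # group values into blocks
    qmax = 2 ** (num_bits - 1) - 1
    amax = np.abs(blocks).max(axis=1, keepdims=True)
    amax = np.where(amax == 0, 1.0, amax)
    # Shared per-block scale, restricted to a power of two.
    scale = 2.0 ** np.ceil(np.log2(amax / qmax))
    q = np.clip(np.round(blocks / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale

def mx_dequantize(q: np.ndarray, scale: np.ndarray, shape) -> np.ndarray:
    """Reconstruct approximate values from block codes and shared scales."""
    return (q.astype(np.float32) * scale).reshape(shape)

# Example: quantize a random tensor block-wise and measure reconstruction error.
x = np.random.randn(4, 64).astype(np.float32)
q, s = mx_quantize(x)
print("mean abs error:", np.abs(x - mx_dequantize(q, s, x.shape)).mean())
```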
