Search Results for author: Nilesh Prasad Pandey

Found 4 papers, 0 papers with code

Softmax Bias Correction for Quantized Generative Models

no code implementations • 4 Sep 2023 • Nilesh Prasad Pandey, Marios Fournarakis, Chirag Patel, Markus Nagel

Post-training quantization (PTQ) is the go-to compression technique for large generative models, such as Stable Diffusion or large language models.

Language Modelling • Quantization
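The paper's exact correction procedure is not reproduced in this listing. As a rough illustration of the general idea behind a softmax bias correction, the sketch below estimates the average bias that low-bit quantization adds to softmax outputs on a calibration set and subtracts it at inference time; the uniform quantizer, the random calibration data, and the function names are all assumptions, not the authors' method.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def quantize(x, n_bits=8):
    # Uniform affine quantization of values in [0, 1].
    scale = 1.0 / (2**n_bits - 1)
    return np.round(x / scale) * scale

# Estimate the per-channel bias that quantization introduces on
# calibration data (hypothetical stand-in for real activations).
rng = np.random.default_rng(0)
calib = rng.normal(size=(1024, 16))
p = softmax(calib)
bias = (quantize(p) - p).mean(axis=0)

def corrected_quant_softmax(x):
    # Subtract the offline bias estimate from the quantized output.
    return quantize(softmax(x)) - bias
```

By construction, the corrected outputs have (near-)zero mean error on the calibration set, which is the kind of systematic shift a bias correction targets; per-example rounding error of course remains.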

A Practical Mixed Precision Algorithm for Post-Training Quantization

no code implementations • 10 Feb 2023 • Nilesh Prasad Pandey, Markus Nagel, Mart van Baalen, Yin Huang, Chirag Patel, Tijmen Blankevoort

We experimentally validate our proposed method on several computer vision and natural language processing tasks across many different networks, and show that we can find mixed precision networks that provide a better trade-off between accuracy and efficiency than their homogeneous bit-width equivalents.

Quantization
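The listing does not include the paper's algorithm, but a generic sensitivity-based mixed-precision assignment can be sketched as follows: measure each layer's quantization error at the candidate bit-widths, then greedily lower the precision of the least-sensitive layers until an average bit budget is met. The greedy criterion, the 4/8-bit candidates, and all names here are assumptions for illustration, not the paper's procedure.

```python
import numpy as np

def quant_error(w, bits):
    # Mean squared error of symmetric uniform quantization of a weight tensor.
    scale = np.abs(w).max() / (2**(bits - 1) - 1)
    q = np.clip(np.round(w / scale), -(2**(bits - 1)), 2**(bits - 1) - 1) * scale
    return float(((w - q) ** 2).mean())

def assign_bitwidths(layers, candidate_bits=(4, 8), budget_bits=6.0):
    # Start every layer at the highest precision, then repeatedly drop the
    # layer whose error increases least, until the average bit budget holds.
    lo, hi = min(candidate_bits), max(candidate_bits)
    bits = {name: hi for name in layers}
    while np.mean(list(bits.values())) > budget_bits:
        best, best_delta = None, None
        for name, w in layers.items():
            if bits[name] == lo:
                continue
            delta = quant_error(w, lo) - quant_error(w, bits[name])
            if best_delta is None or delta < best_delta:
                best, best_delta = name, delta
        if best is None:
            break
        bits[best] = lo
    return bits
```

With a 6-bit average budget over four layers and candidates {4, 8}, this assigns 4 bits to the two least-sensitive layers and keeps 8 bits elsewhere, which is the accuracy/efficiency trade-off the abstract refers to.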
