Quantization

1058 papers with code • 10 benchmarks • 18 datasets

Quantization is a promising technique for reducing the computation cost of neural network training: it replaces high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
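As a concrete illustration, the float-to-integer mapping is commonly an affine transform defined by a scale and a zero point. Below is a minimal NumPy sketch of per-tensor int8 affine quantization; the function names are illustrative only and do not correspond to any particular library's API.

```python
import numpy as np

def quantize_int8(x):
    """Affine (asymmetric) per-tensor quantization of a float32 array to int8.

    Illustrative sketch: real frameworks expose similar scale/zero-point
    parameters but with their own APIs and calibration procedures.
    """
    qmin, qmax = -128, 127
    scale = (x.max() - x.min()) / (qmax - qmin)          # float step size per integer level
    zero_point = int(round(qmin - x.min() / scale))      # integer that represents real value 0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map int8 codes back to approximate float32 values."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.random.randn(4, 4).astype(np.float32)
q, scale, zp = quantize_int8(x)
print(np.abs(x - dequantize(q, scale, zp)).max())        # small quantization error
```

The integer codes are what the hardware stores and computes with; the scale and zero point are kept in float so activations and weights can be mapped back when needed.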

Latest papers with no code

A SER-based Device Selection Mechanism in Multi-bits Quantization Federated Learning

no code yet • 20 Apr 2024

The quality of wireless communication directly affects the performance of federated learning (FL), so this paper analyzes the influence of wireless communication on FL through the symbol error rate (SER).

EdgeFusion: On-Device Text-to-Image Generation

no code yet • 18 Apr 2024

The intensive computational burden of Stable Diffusion (SD) for text-to-image generation poses a significant hurdle for its practical application.

Privacy-Preserving UCB Decision Process Verification via zk-SNARKs

no code yet • 18 Apr 2024

With the increasingly widespread application of machine learning, how to strike a balance between protecting the privacy of data and algorithm parameters and ensuring the verifiability of machine learning has always been a challenge.

LongVQ: Long Sequence Modeling with Vector Quantization on Structured Memory

no code yet • 17 Apr 2024

Transformer models have been successful in various sequence processing tasks, but the self-attention mechanism's computational cost limits its practicality for long sequences.
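For context, vector quantization in this setting generally means replacing continuous feature vectors with the nearest entry of a fixed-size codebook, so that long-range context can be kept as compact discrete indices rather than full-precision states. The sketch below is a generic VQ nearest-codebook lookup for illustration, not the LongVQ algorithm itself.

```python
import numpy as np

def vector_quantize(x, codebook):
    """Replace each row of x with its nearest codebook entry.

    x:        (n, d) array of continuous vectors
    codebook: (k, d) array of k code vectors
    Returns the discrete indices and the quantized vectors.
    Generic VQ lookup; the codebook would normally be learned.
    """
    # squared Euclidean distance between every input vector and every code
    d2 = ((x[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)        # index of the nearest code for each input
    return idx, codebook[idx]

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
codebook = rng.normal(size=(16, 8))
idx, x_q = vector_quantize(x, codebook)
```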

Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

no code yet • 17 Apr 2024

Simulating the dynamics of open quantum systems coupled to non-Markovian environments remains an outstanding challenge due to exponentially scaling computational costs.

QGen: On the Ability to Generalize in Quantization Aware Training

no code yet • 17 Apr 2024

In this work, we investigate the generalization properties of quantized neural networks, a characteristic that has received little attention despite its implications for model performance.

Comprehensive Survey of Model Compression and Speed up for Vision Transformers

no code yet • 16 Apr 2024

Vision Transformers (ViT) have marked a paradigm shift in computer vision, outperforming state-of-the-art models across diverse tasks.

Tripod: Three Complementary Inductive Biases for Disentangled Representation Learning

no code yet • 16 Apr 2024

Inductive biases are crucial in disentangled representation learning for narrowing down an underspecified solution set.

Efficient and accurate neural field reconstruction using resistive memory

no code yet • 15 Apr 2024

The GE harnesses the intrinsic stochasticity of resistive memory for efficient input encoding, while the PE achieves precise weight mapping through a Hardware-Aware Quantization (HAQ) circuit.

TMPQ-DM: Joint Timestep Reduction and Quantization Precision Selection for Efficient Diffusion Models

no code yet • 15 Apr 2024

Diffusion models have emerged as preeminent contenders in the realm of generative models.