Quantization

1000 papers with code • 9 benchmarks • 17 datasets

Quantization is a promising technique for reducing the computation cost of neural network training by replacing high-cost floating-point numbers (e.g., float32) with low-cost fixed-point numbers (e.g., int8/int16).

Source: Adaptive Precision Training: Quantify Back Propagation in Neural Networks with Fixed-point Numbers
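
For concreteness, here is a minimal NumPy sketch (not taken from the cited paper) of affine int8 quantization: a float32 tensor is mapped to 8-bit integers via a per-tensor scale and zero-point, then dequantized back to floats.

    import numpy as np

    def quantize_int8(x):
        """Affine per-tensor quantization of a float32 array to int8."""
        qmin, qmax = -128, 127
        scale = max((x.max() - x.min()) / (qmax - qmin), 1e-12)
        zero_point = int(round(qmin - x.min() / scale))
        q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
        return q, scale, zero_point

    def dequantize(q, scale, zero_point):
        """Map int8 codes back to approximate float32 values."""
        return (q.astype(np.float32) - zero_point) * scale

    x = np.random.randn(4, 4).astype(np.float32)
    q, scale, zp = quantize_int8(x)
    x_hat = dequantize(q, scale, zp)
    print("max abs error:", np.abs(x - x_hat).max())

The scale and zero-point are the only floating-point metadata stored alongside the int8 values, which is where the memory and compute savings come from.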

Latest papers with no code

Super-High-Fidelity Image Compression via Hierarchical-ROI and Adaptive Quantization

no code yet • 19 Mar 2024

MSE-based models aim to improve objective metrics while generative models are leveraged to improve visual quality measured by subjective metrics.

Hierarchical Frequency-based Upsampling and Refining for Compressed Video Quality Enhancement

no code yet • 18 Mar 2024

The goal of video quality enhancement is to reduce compression artifacts and reconstruct a visually pleasant result.

Spatio-Temporal Fluid Dynamics Modeling via Physical-Awareness and Parameter Diffusion Guidance

no code yet • 18 Mar 2024

This paper proposes a two-stage framework named ST-PAD for spatio-temporal fluid dynamics modeling in the field of earth sciences, aiming to achieve high-precision simulation and prediction of fluid dynamics through spatio-temporal physics awareness and parameter diffusion guidance.

HyperVQ: MLR-based Vector Quantization in Hyperbolic Space

no code yet • 18 Mar 2024

However, since the VQVAE is trained with a reconstruction objective, there is no constraint for the embeddings to be well disentangled, a crucial aspect for using them in discriminative tasks.
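
For readers unfamiliar with the quantization step that VQVAE-style models build on, the following is a generic Euclidean codebook-lookup sketch; HyperVQ itself replaces this with an MLR-based formulation in hyperbolic space, so the code below shows only the standard baseline step.

    import numpy as np

    def vq_lookup(z, codebook):
        """Standard (Euclidean) VQ step: snap each encoder output vector
        to its nearest codebook embedding and return the discrete indices."""
        # z: (N, D) encoder outputs, codebook: (K, D) learned embeddings
        d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (N, K) squared distances
        idx = d.argmin(axis=1)    # discrete codes
        z_q = codebook[idx]       # quantized vectors fed to the decoder
        return idx, z_q

    z = np.random.randn(8, 16)
    codebook = np.random.randn(64, 16)
    codes, z_q = vq_lookup(z, codebook)
    print(codes.shape, z_q.shape)  # (8,) (8, 16)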

Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

no code yet • 18 Mar 2024

While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected.

Quantization Effects on Neural Networks Perception: How would quantization change the perceptual field of vision models?

no code yet • 15 Mar 2024

Neural network quantization is an essential technique for deploying models on resource-constrained devices.
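
As a hedged illustration of deployment-oriented quantization (using PyTorch's built-in dynamic quantization, not the evaluation protocol of this paper), the snippet below converts the Linear layers of a small float32 model to int8 weights.

    import torch
    import torch.nn as nn

    # A small float32 model standing in for a larger backbone.
    model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

    # Post-training dynamic quantization: Linear weights are stored in int8,
    # activations are quantized on the fly at inference time.
    quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

    x = torch.randn(1, 128)
    print(model(x).shape, quantized(x).shape)  # both torch.Size([1, 10])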

Quantization Avoids Saddle Points in Distributed Optimization

no code yet • 15 Mar 2024

More specifically, we propose a stochastic quantization scheme and prove that it can effectively escape saddle points and ensure convergence to a second-order stationary point in distributed nonconvex optimization.
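
A standard building block behind such schemes is an unbiased stochastic (randomized-rounding) quantizer; the sketch below is a generic version, not necessarily the exact scheme proposed in the paper.

    import numpy as np

    def stochastic_quantize(x, step=0.1, rng=None):
        """Randomized rounding onto a uniform grid with spacing `step`.
        Rounding up with probability equal to the fractional part makes the
        quantizer unbiased (E[q(x)] = x), injecting zero-mean noise that can
        help iterates escape saddle points."""
        rng = np.random.default_rng(0) if rng is None else rng
        scaled = x / step
        lower = np.floor(scaled)
        prob_up = scaled - lower                 # fractional part in [0, 1)
        up = rng.random(x.shape) < prob_up
        return (lower + up) * step

    x = np.full(100000, 0.37)
    q = stochastic_quantize(x)
    print(q.mean())  # close to 0.37 on average, although each q is 0.3 or 0.4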

UniCode: Learning a Unified Codebook for Multimodal Large Language Models

no code yet • 14 Mar 2024

In this paper, we propose UniCode, a novel approach within the domain of multimodal large language models (MLLMs) that learns a unified codebook to efficiently tokenize visual, text, and potentially other types of signals.

Generalized Relevance Learning Grassmann Quantization

no code yet • 14 Mar 2024

The proposed model returns a set of prototype subspaces and a relevance vector.
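
To make the idea of prototype subspaces and a relevance vector concrete, here is a hedged NumPy sketch of nearest-prototype classification on the Grassmann manifold, where principal angles between subspaces are weighted by a relevance vector; the exact parametrization and learning rule used in the paper may differ.

    import numpy as np

    def weighted_grassmann_distance(U, V, relevance):
        """Distance between subspaces spanned by orthonormal bases U, V (D x p),
        using principal angles weighted by a non-negative relevance vector."""
        # Singular values of U^T V are the cosines of the principal angles.
        cosines = np.clip(np.linalg.svd(U.T @ V, compute_uv=False), 0.0, 1.0)
        return float(np.sum(relevance * (1.0 - cosines ** 2)))  # weighted chordal distance

    def classify(x_basis, prototypes, labels, relevance):
        """Assign a sample subspace to the label of its nearest prototype subspace."""
        dists = [weighted_grassmann_distance(x_basis, P, relevance) for P in prototypes]
        return labels[int(np.argmin(dists))]

    # Toy usage: 2-dimensional subspaces of R^5, three prototype classes.
    rng = np.random.default_rng(0)
    ortho = lambda A: np.linalg.qr(A)[0]
    prototypes = [ortho(rng.standard_normal((5, 2))) for _ in range(3)]
    x_basis = ortho(rng.standard_normal((5, 2)))
    print(classify(x_basis, prototypes, labels=[0, 1, 2], relevance=np.array([0.7, 0.3])))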

FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models

no code yet • 14 Mar 2024

Federated Learning (FL) has garnered increasing attention due to its unique characteristic of allowing heterogeneous clients to process their private data locally and interact with a central server, while preserving privacy.
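
A generic sketch of the communication-saving idea (not FedComLoc's actual algorithm): a client uniformly quantizes its model update to int8 before uploading, and the server dequantizes it on arrival.

    import numpy as np

    def compress_update(delta, bits=8):
        """Uniformly quantize a client's model update (float32 deltas) to
        `bits`-bit integers plus one float scale, shrinking the uplink
        payload roughly 4x for int8."""
        levels = 2 ** (bits - 1) - 1
        scale = max(np.abs(delta).max() / levels, 1e-12)
        q = np.round(delta / scale).astype(np.int8)
        return q, scale

    def decompress_update(q, scale):
        """Server-side reconstruction of the (lossy) update."""
        return q.astype(np.float32) * scale

    delta = np.random.randn(1000).astype(np.float32) * 0.01   # pretend local update
    q, scale = compress_update(delta)
    recovered = decompress_update(q, scale)
    print("payload bytes:", q.nbytes + 4,   # +4 for the float32 scale
          "reconstruction error:", np.abs(delta - recovered).max())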