Quantization

116 papers with code · Methodology

State-of-the-art leaderboards

Latest papers without code

Smart Ternary Quantization

ICLR 2020

Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements.
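For a concrete picture of ternary quantization, here is a minimal threshold-based sketch in Python. The threshold rule and the single scale per tensor are generic assumptions for illustration, not the "smart" scheme this paper proposes.

```python
import numpy as np

def ternarize(w, delta_factor=0.7):
    """Threshold-based ternary quantization of a weight tensor.

    Weights are mapped to {-alpha, 0, +alpha}: values whose magnitude
    exceeds a threshold keep their sign, the rest become zero, and a
    single scale alpha is fit to the surviving weights.
    """
    delta = delta_factor * np.mean(np.abs(w))            # magnitude threshold
    mask = np.abs(w) > delta                             # weights that stay non-zero
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask

w = np.random.randn(64, 3, 3, 3).astype(np.float32)     # e.g. a conv filter bank
w_t = ternarize(w)
print(np.unique(np.round(w_t, 4)))                       # only {-alpha, 0, +alpha}
```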

IMAGE CLASSIFICATION QUANTIZATION

Goten: GPU-Outsourcing Trusted Execution of Neural Network Training and Prediction

ICLR 2020

Before we can see worldwide collaborative efforts in training machine-learning models or widespread deployments of prediction-as-a-service, we need to devise an efficient privacy-preserving mechanism that guarantees the privacy of all stakeholders (data contributors, model owners, and queriers).

QUANTIZATION

Ternary MobileNets via Per-Layer Hybrid Filter Banks

ICLR 2020

Using the proposed quantization method, we quantized a substantial portion of the weight filters of MobileNets to ternary values, resulting in 27.98% savings in energy and a 51.07% reduction in model size, while achieving comparable accuracy and no degradation in throughput on specialized hardware compared to the baseline full-precision MobileNets.

QUANTIZATION

Mixed Precision DNNs: All you need is a good parametrization

ICLR 2020

Since choosing the optimal bitwidths is not straightforward, training methods that can learn them are desirable.
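The general idea can be sketched with a uniform quantizer whose step size is itself a trainable parameter, trained through a straight-through estimator so the rounding step does not block gradients. The parametrization below (learned step size, fixed symmetric range) is a generic assumption, not necessarily the one this paper proposes.

```python
import torch

def round_ste(x):
    # Round with a straight-through estimator: forward rounds,
    # backward treats rounding as the identity.
    return x + (x.round() - x).detach()

class LearnedStepQuantizer(torch.nn.Module):
    """Uniform quantizer with a trainable step size, so the effective
    precision can be learned jointly with the network weights."""
    def __init__(self, init_step=0.05, n_bits=4):
        super().__init__()
        self.step = torch.nn.Parameter(torch.tensor(float(init_step)))
        self.qmax = 2 ** (n_bits - 1) - 1        # symmetric signed range

    def forward(self, x):
        q = torch.clamp(round_ste(x / self.step), -self.qmax, self.qmax)
        return q * self.step

quant = LearnedStepQuantizer(n_bits=4)
w = torch.randn(128, 64, requires_grad=True)
loss = quant(w).pow(2).mean()
loss.backward()                                   # gradients reach both w and quant.step
print(quant.step.grad)
```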

QUANTIZATION

Compression without Quantization

ICLR 2020

Standard compression algorithms work by mapping an image to a discrete code using an encoder, from which the original image can be reconstructed through a decoder.
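The setup this sentence describes can be sketched as a small encoder/decoder pair with a rounded latent code; the architecture and the straight-through rounding trick below are illustrative assumptions, not the paper's method (the paper argues for compression without this quantization step).

```python
import torch
import torch.nn as nn

class TransformCoder(nn.Module):
    """Minimal encoder/decoder pair: the encoder maps an image to a latent
    that is rounded to a discrete code; the decoder reconstructs the image
    from that code."""
    def __init__(self, channels=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, channels, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(channels, channels, 5, stride=2, padding=2),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(channels, channels, 5, stride=2, padding=2,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(channels, 3, 5, stride=2, padding=2,
                               output_padding=1),
        )

    def forward(self, img):
        latent = self.encoder(img)
        code = torch.round(latent)                # the discrete code to be entropy-coded
        # Straight-through estimator keeps the rounding step trainable.
        code = latent + (code - latent).detach()
        return self.decoder(code)

x = torch.rand(1, 3, 64, 64)
print(TransformCoder()(x).shape)                  # torch.Size([1, 3, 64, 64])
```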

IMAGE COMPRESSION QUANTIZATION

Learning Compact Embedding Layers via Differentiable Product Quantization

ICLR 2020

Embedding layers are commonly used to map discrete symbols into continuous embedding vectors that reflect their semantic meanings.
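To make the compression idea concrete, the sketch below shows a product-quantized embedding table: each symbol stores one small codeword index per subspace instead of a full dense vector. The codebook sizes and fixed code assignments are illustrative assumptions; the paper's contribution is learning these assignments differentiably.

```python
import torch
import torch.nn as nn

class PQEmbedding(nn.Module):
    """Product-quantized embedding: each symbol's vector is the concatenation
    of one codeword per subspace, so only small integer codes plus shared
    codebooks need to be stored."""
    def __init__(self, vocab_size, dim, n_subspaces=4, codebook_size=16):
        super().__init__()
        assert dim % n_subspaces == 0
        sub_dim = dim // n_subspaces
        # Learned codebooks: K codewords of size sub_dim per subspace.
        self.codebooks = nn.Parameter(torch.randn(n_subspaces, codebook_size, sub_dim))
        # Per-symbol codes: one codeword index per subspace (fixed here;
        # the paper learns these assignments end to end).
        self.register_buffer(
            "codes", torch.randint(codebook_size, (vocab_size, n_subspaces)))

    def forward(self, tokens):                    # tokens: (batch, seq)
        parts = [self.codebooks[s][self.codes[tokens, s]]
                 for s in range(self.codebooks.shape[0])]
        return torch.cat(parts, dim=-1)           # (batch, seq, dim)

emb = PQEmbedding(vocab_size=10000, dim=64)
print(emb(torch.randint(10000, (2, 5))).shape)    # torch.Size([2, 5, 64])
```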

QUANTIZATION

Mixed Precision Training With 8-bit Floating Point

ICLR 2020

Reduced precision computation is one of the key areas addressing the widening 'compute gap', driven by an exponential growth in deep learning applications.

QUANTIZATION

GQ-Net: Training Quantization-Friendly Deep Networks

ICLR 2020

Network quantization is a model compression and acceleration technique that has become essential to neural network deployment.
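For context, the most basic form of network quantization is post-training affine quantization of a weight tensor to 8 bits, sketched below. This is a generic illustration of the compression effect (4x smaller than float32 plus integer arithmetic), unrelated to GQ-Net's specific training procedure.

```python
import numpy as np

def quantize_8bit(w):
    """Affine post-training quantization of a weight tensor to 8 bits.

    The float tensor is stored as 8-bit integers plus one scale and
    zero-point, cutting memory 4x versus float32."""
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / 255.0
    zero_point = np.round(-w_min / scale)
    q = np.clip(np.round(w / scale + zero_point), 0, 255).astype(np.uint8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, scale, zp = quantize_8bit(w)
print("max abs error:", np.abs(dequantize(q, scale, zp) - w).max())
```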

MODEL COMPRESSION QUANTIZATION

Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

ICLR 2020

As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed on clusters to perform model fitting in parallel.
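A common way to cut that communication cost is to quantize each worker's gradient before it is exchanged. The sketch below uses an unbiased uniform stochastic quantizer as a generic illustration; the paper's contribution is a nonuniform scheme, which is not reproduced here.

```python
import numpy as np

def quantize_gradient(g, levels=16, rng=np.random):
    """Stochastically quantize a gradient vector to a few uniform levels so
    that only small integers plus one scale are communicated. Stochastic
    rounding keeps the quantizer unbiased in expectation."""
    scale = np.abs(g).max()
    if scale == 0:
        return np.zeros_like(g, dtype=np.int8), scale
    x = np.abs(g) / scale * levels               # position in [0, levels]
    lower = np.floor(x)
    prob_up = x - lower                          # round up with this probability
    q = lower + (rng.random(g.shape) < prob_up)
    return (np.sign(g) * q).astype(np.int8), scale

def dequantize_gradient(q, scale, levels=16):
    return q.astype(np.float32) * scale / levels

# Each worker sends (q, scale); the server averages the dequantized gradients.
g = np.random.randn(1000).astype(np.float32)
q, s = quantize_gradient(g)
print("mean abs error:", np.abs(dequantize_gradient(q, s) - g).mean())
```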

QUANTIZATION

Fix-Net: pure fixed-point representation of deep neural networks

ICLR 2020

Deep neural networks (DNNs) dominate current research in machine learning.

QUANTIZATION