Model compression techniques for deep neural networks (DNNs) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective such method.
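As a concrete illustration of weight pruning (a generic sketch, not the specific scheme of any method discussed here), the following PyTorch snippet performs one-shot magnitude-based pruning; the layer shape and sparsity level are hypothetical.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float) -> torch.Tensor:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight.clone()
    # The k-th smallest absolute value serves as the pruning threshold.
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask

# Example: prune 90% of a hypothetical fully-connected layer.
w = torch.randn(256, 512)
w_pruned = magnitude_prune(w, 0.9)
print(f"achieved sparsity: {(w_pruned == 0).float().mean():.2f}")
```

In practice, pruning is usually followed by fine-tuning or applied iteratively to recover accuracy.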
To mitigate these challenges, the memristor crossbar array has emerged as an intrinsically suitable framework for matrix computation and low-power acceleration in DNN applications.
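To make the suitability claim concrete: in an ideal crossbar, applying input voltages along the rows yields, via Ohm's and Kirchhoff's laws, output currents equal to a matrix-vector product in a single analog step. The following is a simplified digital simulation under idealized assumptions (non-negative conductances, no device noise or quantization); the dimensions are hypothetical.

```python
import numpy as np

def crossbar_mvm(G: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Idealized crossbar: row voltages v and cell conductances G give
    column currents i = G^T v in one analog step."""
    return G.T @ v

# Hypothetical 4x3 crossbar; real designs use differential conductance
# pairs to represent signed weights.
G = np.random.uniform(0.0, 1.0, size=(4, 3))
v = np.random.randn(4)
print(crossbar_mvm(G, v))
```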
Despite their ubiquity in NLP tasks, Long Short-Term Memory (LSTM) networks suffer from computational inefficiencies caused by their inherently unparallelizable recurrences, a problem that is further aggravated as LSTMs require more parameters for larger memory capacity.
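The recurrence can be seen directly in an unrolled LSTM: step t cannot begin until step t-1 has produced its hidden and cell states. A minimal PyTorch sketch follows (all sizes hypothetical).

```python
import torch
import torch.nn as nn

def lstm_unroll(cell: nn.LSTMCell, x: torch.Tensor,
                h: torch.Tensor, c: torch.Tensor):
    """Sequentially unrolled LSTM; the loop-carried dependence on (h, c)
    is the unparallelizable recurrence over the time dimension."""
    outputs = []
    for t in range(x.size(0)):        # x: (seq_len, batch, input_size)
        h, c = cell(x[t], (h, c))     # depends on the previous step
        outputs.append(h)
    return torch.stack(outputs), (h, c)

cell = nn.LSTMCell(input_size=32, hidden_size=64)
x = torch.randn(10, 8, 32)            # 10 time steps, batch of 8
h0 = torch.zeros(8, 64)
c0 = torch.zeros(8, 64)
out, _ = lstm_unroll(cell, x, h0, c0)
```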
Pre-trained language models such as BERT have proven to be highly effective for natural language processing (NLP) tasks.
A surge in artificial intelligence and autonomous technologies has increased the demand for enhanced edge-processing capabilities.
Therefore, we propose Network of Neural Networks (NoNN), a new distributed IoT learning paradigm that compresses a large pre-trained 'teacher' deep network into several disjoint and highly compressed 'student' modules without loss of accuracy.
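The partitioning idea can be sketched as follows. NoNN partitions the teacher's final-layer knowledge (in the original work, based on filter activation patterns); the simplified stand-in below merely splits the teacher's feature channels into disjoint slices, and the student architecture shown is hypothetical.

```python
import torch
import torch.nn as nn

def partition_teacher_features(teacher_feat: torch.Tensor, num_students: int):
    # Disjoint split of the teacher's features: each slice becomes the
    # target for one student module, so students can run independently
    # on separate edge devices and are fused only at the classifier.
    return torch.chunk(teacher_feat, num_students, dim=-1)

class StudentModule(nn.Module):
    """Hypothetical highly compressed student mimicking one partition."""
    def __init__(self, in_dim: int, part_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 64), nn.ReLU(),
            nn.Linear(64, part_dim),
        )

    def forward(self, x):
        return self.net(x)
```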
In the distillation process, we propose a fidelity loss that enables the student network to maintain the representational capability of the teacher network.
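The exact form of the fidelity loss is not spelled out here, so the following is only a plausible sketch under assumptions: a representation-matching MSE term standing in for the fidelity loss, combined with standard temperature-scaled distillation; T and alpha are hypothetical hyperparameters.

```python
import torch.nn.functional as F

def fidelity_loss(student_feat, teacher_feat):
    # Assumed instantiation: match intermediate representations with an
    # MSE term, keeping the teacher's features fixed.
    return F.mse_loss(student_feat, teacher_feat.detach())

def total_distillation_loss(student_logits, teacher_logits,
                            student_feat, teacher_feat,
                            T: float = 4.0, alpha: float = 0.5):
    # Temperature-scaled KL term (standard knowledge distillation)
    # plus the representation-fidelity term.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                  F.softmax(teacher_logits / T, dim=-1),
                  reduction="batchmean") * (T * T)
    return kd + alpha * fidelity_loss(student_feat, teacher_feat)
```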