Model Compression

343 papers with code • 2 benchmarks • 4 datasets

Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.

Source: KD-MRI: A knowledge distillation framework for image reconstruction and image restoration in MRI workflow
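
As a rough illustration of the three method families named in the description, the following NumPy sketch applies each one to a single weight matrix. It is a minimal sketch under stated assumptions, not the method of any paper listed on this page: the function names (`magnitude_prune`, `low_rank_factorize`, `quantize_uniform`) are hypothetical, pruning is done by global magnitude, factorization uses a truncated SVD, and quantization is symmetric and uniform with at most 8 bits.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Parameter pruning (assumed variant: global magnitude pruning).

    Zeroes the `int(sparsity * w.size)` entries with the smallest absolute value.
    """
    k = int(sparsity * w.size)
    if k == 0:
        return w.copy()
    # k-th smallest absolute value; everything at or below it is dropped.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= threshold, 0.0, w)


def low_rank_factorize(w, rank):
    """Low-rank factorization (assumed variant: truncated SVD).

    Returns factors a (m x rank) and b (rank x n) with a @ b approximating w.
    Storage drops from m * n values to rank * (m + n).
    """
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b


def quantize_uniform(w, bits=8):
    """Weight quantization (assumed variant: symmetric, uniform, per-tensor).

    Returns int8 codes and the float scale; dequantize with q * scale.
    """
    assert 2 <= bits <= 8, "this sketch stores codes in int8"
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((64, 32))  # stand-in for one layer's weights

    pruned = magnitude_prune(w, sparsity=0.5)
    a, b = low_rank_factorize(w, rank=8)
    q, scale = quantize_uniform(w, bits=8)

    print("nonzero after pruning:", np.count_nonzero(pruned), "of", w.size)
    print("low-rank relative error:", np.linalg.norm(w - a @ b) / np.linalg.norm(w))
    print("max quantization error:", np.max(np.abs(w - q * scale)))
```

In practice these steps are usually followed by fine-tuning or calibration to recover accuracy, which this sketch leaves out.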

Benchmarks

These leaderboards are used to track progress in Model Compression

Dataset	Best Model
ImageNet	ADLIK-MO-ResNet50+W4A4
QNLI	MobileBERT + 2bit-1dim model compression using DKM

Libraries

Use these libraries to find Model Compression models and implementations

Library	Papers	Stars
yoshitomo-matsubara/torchdistill	6	1,281
UCMerced-ML/LC-model-compression	5	
yeshaokai/Robustness-Aware-Pruning-…	3	
NervanaSystems/distiller	2	4,309

See all 8 libraries.

Subtasks

Neural Network Compression

Latest papers

Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design

seolabcornell/torch2chip • 2 May 2024

The limited degree of freedom in the current toolkit and the under-explored customization hinder the prototype ASIC or FPGA-based accelerator design.

Data-free Knowledge Distillation for Fine-grained Visual Categorization

roryshao/dfkd-fgvc • ICCV 2023

Our approach utilizes an adversarial distillation framework with attention generator, mixed high-order attention distillation, and semantic feature contrast learning.

18 Apr 2024

Transferable and Principled Efficiency for Open-Vocabulary Segmentation

faceonlive/ai-research • 11 Apr 2024

In the context of efficient OVS, we target achieving performance that is comparable to or even better than prior OVS works based on large vision-language foundation models, by utilizing smaller models that incur lower training costs.

208

Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind

faceonlive/ai-research • 6 Apr 2024

MBS overcomes the English-centric limitations of existing methods by sampling calibration data from various languages proportionally to the language distribution of the model training datasets.

208

Are Compressed Language Models Less Subgroup Robust?

wearepal/compression-subgroup • 26 Mar 2024

To reduce the inference cost of large language models, model compression is increasingly used to create smaller scalable models.

Tiny Models are the Computational Saver for Large Models

QingyuanWang/tinysaver • 26 Mar 2024

By searching for and employing the most appropriate tiny model as the computational saver for a given large model, the proposed approaches work as a novel and generic method for model compression.

Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

saintslab/pepr • 14 Mar 2024

We present experiments on two benchmark datasets showing that adversarial fine-tuning of compressed models can achieve robustness performance comparable to adversarially trained models, while also improving computational efficiency.

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

aiot-mlsys-lab/svd-llm • 12 Mar 2024

The advancements in Large Language Models (LLMs) have been hindered by their substantial sizes, which necessitate LLM compression methods for practical deployment.

Bit-mask Robust Contrastive Knowledge Distillation for Unsupervised Semantic Hashing

hly1998/brcd • 10 Mar 2024

In this paper, we propose an innovative Bit-mask Robust Contrastive knowledge Distillation (BRCD) method, specifically devised for the distillation of semantic hashing models.

DyCE: Dynamic Configurable Exiting for Deep Learning Compression and Scaling

QingyuanWang/dyce • 4 Mar 2024

Moreover, most current dynamic compression designs are monolithic and tightly integrated with base models, thereby complicating the adaptation to novel base models.
