Model Compression
342 papers with code • 2 benchmarks • 4 datasets
Model Compression has been an actively pursued area of research over the last few years, with the goal of deploying state-of-the-art deep networks on low-power, resource-limited devices without a significant drop in accuracy. Parameter pruning, low-rank factorization, and weight quantization are among the methods proposed to reduce the size of deep networks.
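As a concrete illustration of one of these techniques, here is a minimal sketch of magnitude-based weight pruning in PyTorch. The tensor shapes and sparsity level are illustrative assumptions, not drawn from any particular paper on this page.

```python
import torch

def magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5) -> torch.Tensor:
    """Zero out the smallest-magnitude weights (sparsity is an illustrative choice)."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return weight
    # k-th smallest absolute value serves as the pruning threshold
    threshold = weight.abs().flatten().kthvalue(k).values
    mask = weight.abs() > threshold
    return weight * mask

w = torch.randn(256, 256)
w_pruned = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {(w_pruned != 0).float().mean():.2f}")
```

In practice, pruning is usually followed by fine-tuning to recover accuracy, and the resulting sparse tensors need sparse storage or hardware support to realize actual savings.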
Latest papers
PromptMM: Multi-Modal Knowledge Distillation for Recommendation with Prompt-Tuning
Additionally, to adjust for the impact of inaccuracies in multimedia data, a disentangled multi-modal list-wise distillation is developed with a modality-aware re-weighting mechanism.
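The paper's list-wise, modality-aware objective is more elaborate than can be shown here; as background, this is a generic soft-target knowledge distillation loss (Hinton-style) on which such methods build. The temperature value is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 4.0) -> torch.Tensor:
    """KL divergence between temperature-softened teacher and student distributions."""
    t = temperature
    soft_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_soft_student = F.log_softmax(student_logits / t, dim=-1)
    # Scale by t^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_soft_student, soft_teacher, reduction="batchmean") * (t * t)
```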
LLM Inference Unveiled: Survey and Roofline Model Insights
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for the systematic analysis of LLM inference techniques.
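To make the roofline idea concrete: attainable throughput is the minimum of a device's peak compute and its memory bandwidth times the workload's arithmetic intensity. The hardware numbers and the 7B-parameter decode example below are illustrative assumptions, not figures from the survey.

```python
def roofline_time(flops: float, bytes_moved: float,
                  peak_flops: float, peak_bandwidth: float) -> float:
    """Time for a kernel under the roofline model: min(compute-bound, memory-bound)."""
    intensity = flops / bytes_moved                      # FLOPs per byte
    attainable = min(peak_flops, intensity * peak_bandwidth)
    return flops / attainable                            # seconds

# Single decode step of a 7B-parameter model in FP16 (illustrative numbers):
params = 7e9
flops = 2 * params        # ~2 FLOPs per parameter per generated token
bytes_moved = 2 * params  # FP16 weights read once per token
t = roofline_time(flops, bytes_moved, peak_flops=300e12, peak_bandwidth=2e12)
print(f"~{t * 1e3:.2f} ms per token (memory-bound)")
```

With an arithmetic intensity of only 1 FLOP/byte, decoding sits far below the compute roof, which is why weight quantization (reducing bytes moved) speeds up LLM inference so effectively.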
A Survey on Knowledge Distillation of Large Language Models
In the era of Large Language Models (LLMs), Knowledge Distillation (KD) emerges as a pivotal methodology for transferring advanced capabilities from leading proprietary LLMs, such as GPT-4, to their open-source counterparts like LLaMA and Mistral.
QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning
Diffusion models have achieved remarkable success in image generation tasks, yet their practical deployment is constrained by high memory and time consumption.
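QuEST's selective finetuning is beyond a short snippet, but the low-bit quantization it starts from can be sketched. Below is symmetric uniform quantization to a given bit width; the 4-bit setting and tensor shape are illustrative assumptions.

```python
import torch

def quantize_uniform(x: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Symmetric uniform quantization to n_bits; returns the dequantized tensor."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max() / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale  # dequantize to measure the rounding error

w = torch.randn(128, 128)
w_q = quantize_uniform(w, n_bits=4)
print(f"MSE at 4 bits: {(w - w_q).pow(2).mean():.6f}")
```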
The Potential of AutoML for Recommender Systems
We found that AutoML and AutoRecSys libraries performed best.
Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward
Despite the impressive performance of LLMs, their widespread adoption faces challenges due to substantial computational and memory requirements during inference.
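A back-of-the-envelope calculation shows why these requirements are substantial and why lower precision helps; the 7B-parameter model size is an illustrative assumption, and this counts weights only (KV cache and activations add more).

```python
def weight_memory_gb(n_params: float, bits: int) -> float:
    """Memory needed just to hold model weights at a given precision."""
    return n_params * bits / 8 / 1e9

for bits in (32, 16, 8, 4):
    print(f"7B model @ {bits:>2}-bit: {weight_memory_gb(7e9, bits):.1f} GB")
# 28.0 GB, 14.0 GB, 7.0 GB, 3.5 GB
```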
LiDAR-PTQ: Post-Training Quantization for Point Cloud 3D Object Detection
To our knowledge, this is the first time in LiDAR-based 3D detection tasks that a PTQ INT8 model's accuracy is almost the same as the FP32 model's while enjoying a $3\times$ inference speedup.
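The core of post-training quantization is choosing scales from a small calibration set without retraining. Here is a minimal per-tensor INT8 calibration sketch using a max-absolute-value calibrator; real PTQ pipelines (including LiDAR-PTQ) use more robust calibration, so treat this as a simplified assumption.

```python
import torch

def calibrate_scale(activations: torch.Tensor) -> float:
    """Per-tensor scale from the calibration data's max absolute value."""
    return activations.abs().max().item() / 127.0

def quantize_int8(x: torch.Tensor, scale: float) -> torch.Tensor:
    return torch.clamp(torch.round(x / scale), -128, 127).to(torch.int8)

calib = torch.randn(1024, 64)          # stand-in for real calibration activations
scale = calibrate_scale(calib)
x_q = quantize_int8(calib, scale)
x_dq = x_q.float() * scale             # dequantize to check the error
print(f"max abs error: {(calib - x_dq).abs().max():.4f}")
```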
TQCompressor: improving tensor decomposition methods in neural networks via permutations
The result of the compression is the TQCompressedGPT-2 model, featuring 81 million parameters.
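TQCompressor's permutation-enhanced decomposition is specific to the paper, but the underlying idea of factorizing weight matrices can be shown with plain truncated SVD. The matrix size and rank below are illustrative assumptions.

```python
import torch

def low_rank_factorize(weight: torch.Tensor, rank: int):
    """Truncated SVD: replace W (m x n) with A (m x r) @ B (r x n)."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # fold singular values into the left factor
    B = Vh[:rank, :]
    return A, B

W = torch.randn(768, 768)
A, B = low_rank_factorize(W, rank=64)
orig, compressed = W.numel(), A.numel() + B.numel()
print(f"params: {orig} -> {compressed} ({compressed / orig:.1%})")
print(f"relative error: {torch.norm(W - A @ B) / torch.norm(W):.3f}")
```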
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation
Federated Learning (FL) is a promising technique for the collaborative training of deep neural networks across multiple devices while preserving data privacy.
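Weight clustering cuts FL communication by sending per-weight cluster indices plus a small codebook instead of full-precision tensors. Below is a naive k-means sketch of that idea; the cluster count, initialization, and tensor size are illustrative assumptions, not the paper's adaptive scheme.

```python
import torch

def cluster_weights(weight: torch.Tensor, n_clusters: int = 16, iters: int = 10):
    """Naive k-means over weight values: transmit indices + a small codebook."""
    flat = weight.flatten()
    # Initialize centroids from evenly spaced quantiles of the weight values.
    centroids = torch.quantile(flat, torch.linspace(0, 1, n_clusters))
    for _ in range(iters):
        assign = (flat.unsqueeze(1) - centroids.unsqueeze(0)).abs().argmin(dim=1)
        for c in range(n_clusters):
            members = flat[assign == c]
            if members.numel() > 0:
                centroids[c] = members.mean()
    assign = (flat.unsqueeze(1) - centroids.unsqueeze(0)).abs().argmin(dim=1)
    return assign.reshape(weight.shape), centroids

w = torch.randn(64, 64)
idx, codebook = cluster_weights(w)
# 4-bit indices + 16 float centroids instead of 4096 full-precision floats.
```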
Model Compression Techniques in Biometrics Applications: A Survey
The development of deep learning algorithms has greatly expanded humanity's capacity to automate tasks.