Dataset Condensation

24 papers with code • 0 benchmarks • 0 datasets

Condense a full training dataset into a tiny set of synthetic data such that models trained on the synthetic set perform comparably to models trained on the original data.
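
At a high level, most methods below can be read against a generic bilevel formulation (a textbook statement of the problem, not any single paper's objective): learn a small synthetic set S so that a model trained on S minimises the loss on the original data T,

    \min_{S} \; \mathcal{L}_{T}\big(\theta^{*}(S)\big) \quad \text{s.t.} \quad \theta^{*}(S) = \arg\min_{\theta} \mathcal{L}_{S}(\theta),

where L_S and L_T are the training losses on the synthetic and original data. Because the inner optimisation is expensive, practical methods usually replace it with cheaper surrogates such as gradient matching, feature-distribution matching, or closed-form kernel regression.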

Most implemented papers

Efficient Dataset Distillation Using Random Feature Approximation

yolky/rfad 21 Oct 2022

Dataset distillation compresses large datasets into smaller synthetic coresets that retain performance, with the aim of reducing the storage and computational burden of processing the entire dataset.
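
A minimal sketch of the general idea behind such kernel-based distillation, assuming a random-feature approximation of the kernel (all names, widths, and hyperparameters below are illustrative, not the authors' code): fit kernel ridge regression on the synthetic set and backpropagate its error on real data into the synthetic images.

    import torch
    import torch.nn as nn

    def random_features(x, width=2048, seed=0):
        # Features of a randomly initialised, frozen network; their inner
        # products approximate a network kernel (the random-feature idea).
        torch.manual_seed(seed)
        net = nn.Sequential(nn.Flatten(),
                            nn.Linear(x[0].numel(), width), nn.ReLU(),
                            nn.Linear(width, width), nn.ReLU())
        for p in net.parameters():
            p.requires_grad_(False)
        return net(x) / width ** 0.5

    def krr_distill_loss(x_syn, y_syn, x_real, y_real, reg=1e-3):
        # Fit kernel ridge regression on the synthetic set, score it on real data.
        phi_s, phi_r = random_features(x_syn), random_features(x_real)
        k_ss = phi_s @ phi_s.t()                      # synthetic-synthetic kernel
        k_rs = phi_r @ phi_s.t()                      # real-synthetic kernel
        alpha = torch.linalg.solve(k_ss + reg * torch.eye(len(x_syn)), y_syn)
        return ((k_rs @ alpha - y_real) ** 2).mean()  # predictor error on real data

    # Toy driver: x_syn with one-hot y_syn is the learnable coreset.
    x_syn = torch.randn(10, 3 * 32 * 32, requires_grad=True)
    y_syn = torch.eye(10)
    x_real = torch.randn(256, 3 * 32 * 32)            # stand-in for a real batch
    y_real = torch.eye(10)[torch.randint(0, 10, (256,))]
    opt = torch.optim.SGD([x_syn], lr=0.1)
    opt.zero_grad()
    krr_distill_loss(x_syn, y_syn, x_real, y_real).backward()
    opt.step()

In practice methods of this kind average the features over many random networks rather than a single frozen one.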

Improved Distribution Matching for Dataset Condensation

uitrbn/idm CVPR 2023

In this paper, we propose a novel dataset condensation method based on distribution matching, which is more efficient and promising.
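
The underlying distribution-matching idea can be illustrated with a short, self-contained sketch (illustrative shapes and hyperparameters, not the IDM implementation): embed real and synthetic batches with randomly initialised, frozen networks and shrink the gap between their feature statistics.

    import torch
    import torch.nn as nn

    def random_embedder(in_dim, out_dim=1024):
        net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, out_dim), nn.ReLU())
        for p in net.parameters():
            p.requires_grad_(False)              # the embedder is random and frozen
        return net

    def dm_loss(x_real, x_syn):
        f = random_embedder(x_real[0].numel())   # fresh random network each step
        # match mean embeddings of real and synthetic data (moment matching)
        return ((f(x_real).mean(0) - f(x_syn).mean(0)) ** 2).sum()

    # Learnable synthetic images: here 10 flattened 32x32 RGB images.
    x_syn = torch.randn(10, 3 * 32 * 32, requires_grad=True)
    opt = torch.optim.SGD([x_syn], lr=1.0)
    for _ in range(100):
        x_real = torch.randn(256, 3 * 32 * 32)   # stand-in for a sampled real batch
        opt.zero_grad()
        dm_loss(x_real, x_syn).backward()
        opt.step()

In practice batches are usually drawn per class and augmented consistently; the paper proposes improvements over this basic distribution-matching setup.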

Privacy for Free: How does Dataset Condensation Help Privacy?

Guang000/Awesome-Dataset-Distillation 1 Jun 2022

In this work, we identify for the first time that dataset condensation (DC), originally designed to improve training efficiency, is also a better replacement for traditional data generators in private data generation, thus providing privacy for free.

Delving into Effective Gradient Matching for Dataset Condensation

Guang000/Awesome-Dataset-Distillation 30 Jul 2022

In this work, we delve into the gradient matching method from a comprehensive perspective and answer the critical questions of what, how, and where to match.
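
As a rough illustration of what "matching" means here (a generic sketch, not the paper's specific criterion; the network, shapes, and labels below are placeholders): compare the gradient a freshly initialised network produces on the synthetic batch with its gradient on a real batch, and update the synthetic images to close the gap.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def grads(model, x, y, create_graph=False):
        loss = F.cross_entropy(model(x), y)
        return torch.autograd.grad(loss, tuple(model.parameters()), create_graph=create_graph)

    def match_loss(g_syn, g_real):
        # layer-wise cosine distance between the two gradient sets
        return sum(1 - F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
                   for a, b in zip(g_syn, g_real))

    model = nn.Sequential(nn.Linear(3 * 32 * 32, 128), nn.ReLU(), nn.Linear(128, 10))
    x_syn = torch.randn(10, 3 * 32 * 32, requires_grad=True)   # learnable synthetic images
    y_syn = torch.arange(10)                                    # one image per class
    x_real = torch.randn(256, 3 * 32 * 32)                      # stand-in for a real batch
    y_real = torch.randint(0, 10, (256,))

    g_syn = grads(model, x_syn, y_syn, create_graph=True)   # keep graph so x_syn gets gradients
    g_real = [g.detach() for g in grads(model, x_real, y_real)]
    match_loss(g_syn, g_real).backward()                     # x_syn.grad now drives the update

The sketch hard-codes several choices (which parameters to compare, which distance to use, when during training to match); these are the kinds of choices the what/how/where questions above refer to.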

Dataset Condensation with Latent Space Knowledge Factorization and Sharing

Guang000/Awesome-Dataset-Distillation 21 Aug 2022

In this paper, we introduce a novel approach for systematically and efficiently solving the dataset condensation problem by exploiting the regularity in a given dataset.

Dataset Distillation: A Comprehensive Review

Guang000/Awesome-Dataset-Distillation 17 Jan 2023

The recent success of deep learning is largely attributed to the sheer amount of data used to train deep neural networks. Despite this unprecedented success, the massive data volume significantly increases the burden on storage and transmission and further leads to a cumbersome model training process.

Towards Efficient Deep Hashing Retrieval: Condensing Your Data via Feature-Embedding Matching

Guang000/Awesome-Dataset-Distillation 29 May 2023

The cost of training state-of-the-art deep hashing retrieval models has risen with the adoption of more sophisticated models and large-scale datasets.

Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective

VILA-Lab/SRe2L NeurIPS 2023

The proposed method demonstrates flexibility across diverse dataset scales and exhibits multiple advantages in terms of arbitrary resolutions of synthesized images, low training cost and memory consumption with high-resolution synthesis, and the ability to scale up to arbitrary evaluation network architectures.

Fast Graph Condensation with Structure-based Neural Tangent Kernel

wanglin0126/gcsntk 17 Oct 2023

The rapid development of Internet technology has given rise to a vast amount of graph-structured data.

You Only Condense Once: Two Rules for Pruning Condensed Datasets

he-y/you-only-condense-once NeurIPS 2023

However, such deployment scenarios present two significant challenges: 1) the varying computational resources available on the devices require a dataset size different from that of the pre-defined condensed dataset, and 2) the limited computational resources often preclude running additional condensation processes.