Dataset Distillation - 1IPC
11 papers with code • 4 benchmarks • 3 datasets
Dataset distillation aims to compress a dataset into a much smaller synthetic one so that a model trained on the distilled dataset achieves high accuracy. Concretely, the 1-IPC setting asks for the distilled set that maximizes classification accuracy under a budget of one distilled image per class.
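As a concrete illustration (not tied to any particular paper), the evaluation protocol behind this benchmark can be sketched as follows: train a fresh network from scratch on the 1-IPC distilled set, then report its accuracy on the real test set. The model, hyperparameters, and tensor shapes below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def evaluate_distilled(x_syn, y_syn, test_loader, epochs=300, lr=0.01):
    """x_syn: (C, 3, 32, 32) distilled images, one per class; y_syn: (C,) integer labels."""
    model = nn.Sequential(  # tiny ConvNet stand-in; papers typically use ConvNet-3
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(64, int(y_syn.max()) + 1),
    )
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    for _ in range(epochs):                      # the distilled set is tiny, so
        opt.zero_grad()                          # each epoch is a single batch
        F.cross_entropy(model(x_syn), y_syn).backward()
        opt.step()
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in test_loader:                 # accuracy on the *real* test set
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    return correct / total
```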
Most implemented papers
Dataset Distillation by Matching Training Trajectories
To efficiently obtain the initial and target network parameters for large-scale datasets, we pre-compute and store training trajectories of expert networks trained on the real dataset.
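A hedged sketch of the resulting matching objective, under the usual trajectory-matching formulation: the student network is initialized at a stored expert checkpoint, trained for a few differentiable steps on the synthetic data, and its parameters are pulled toward a later checkpoint of the same expert trajectory, normalized by how far the expert itself moved. Function and argument names are illustrative, not the paper's API.

```python
import torch

def trajectory_matching_loss(student_params, theta_t, theta_t_plus_M):
    """All arguments are flattened 1-D parameter vectors.
    student_params: student parameters after N differentiable steps on synthetic data,
    starting from the expert checkpoint theta_t; theta_t_plus_M: a later expert checkpoint."""
    num = (student_params - theta_t_plus_M).pow(2).sum()
    den = (theta_t - theta_t_plus_M).pow(2).sum()
    return num / den  # gradients flow back through the student steps to the synthetic images
```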
Dataset Condensation with Gradient Matching
As the state-of-the-art machine learning methods in many fields rely on larger datasets, storing datasets and training models on them become significantly more expensive.
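The method named in the title condenses data by matching training gradients: the gradients of the network loss on a real batch and on the synthetic set are pushed to agree. A simplified, layer-wise cosine version is sketched below; the paper's exact grouping of gradient components differs, and names are illustrative.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, x_real, y_real, x_syn, y_syn):
    """Layer-wise distance between gradients on a real batch and on the synthetic set."""
    g_real = torch.autograd.grad(
        F.cross_entropy(model(x_real), y_real), model.parameters())
    g_syn = torch.autograd.grad(
        F.cross_entropy(model(x_syn), y_syn), model.parameters(),
        create_graph=True)                       # keep the graph so the synthetic
    loss = 0.0                                   # images receive gradients
    for gr, gs in zip(g_real, g_syn):
        gr, gs = gr.flatten(), gs.flatten()
        loss = loss + 1 - F.cosine_similarity(gr.detach(), gs, dim=0)
    return loss
```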
Dataset Condensation with Distribution Matching
Computational cost of training state-of-the-art deep models in many learning problems is rapidly increasing due to more sophisticated models and larger datasets.
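Distribution matching avoids the inner optimization entirely by aligning feature statistics of real and synthetic data under (typically randomly initialized) embedding networks. A minimal single-batch sketch, assuming one real/synthetic pair from the same class is passed in; in practice the loss is computed class by class over many sampled embedders.

```python
import torch

def distribution_matching_loss(embed, x_real, x_syn):
    """embed: a feature extractor returning (N, D) embeddings.
    Match the mean embedding of real and synthetic images."""
    mu_real = embed(x_real).mean(dim=0).detach()   # real side needs no gradient
    mu_syn = embed(x_syn).mean(dim=0)
    return (mu_real - mu_syn).pow(2).sum()
```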
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation
In trajectory-matching methods, small matching errors compound as training on the synthetic data proceeds, producing an accumulated trajectory error. To mitigate its adverse impact, we propose a novel approach that encourages the optimization algorithm to seek a flat trajectory.
Dataset Condensation with Differentiable Siamese Augmentation
In many machine learning problems, large-scale datasets have become the de-facto standard to train state-of-the-art deep networks at the price of heavy computation load.
Dataset Distillation using Neural Feature Regression
Dataset distillation can be formulated as a bi-level meta-learning problem where the outer loop optimizes the meta-dataset and the inner loop trains a model on the distilled data.
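The generic bi-level formulation can be sketched by unrolling a short, differentiable inner training run and back-propagating the outer (real-data) loss into the synthetic images. The sketch below uses a plain linear model and manual SGD updates purely for illustration; the paper itself avoids full unrolling by solving the inner problem with a regression on neural features.

```python
import torch
import torch.nn.functional as F

def bilevel_step(x_syn, y_syn, x_real, y_real, inner_steps=10, inner_lr=0.1):
    """One outer step: unroll a short inner training run on (x_syn, y_syn), then
    measure the loss of the resulting model on real data. x_syn must be created
    with requires_grad=True and be updated by an outer optimizer (e.g. Adam)."""
    d, c = x_syn.shape[1], int(y_syn.max()) + 1
    w = torch.zeros(d, c, requires_grad=True)        # fresh inner model (linear here)
    for _ in range(inner_steps):                     # inner loop: train on distilled data
        inner_loss = F.cross_entropy(x_syn @ w, y_syn)
        (g,) = torch.autograd.grad(inner_loss, w, create_graph=True)
        w = w - inner_lr * g                         # keep the graph for the outer gradient
    outer_loss = F.cross_entropy(x_real @ w, y_real) # outer loop: evaluate on real data
    outer_loss.backward()                            # gradients reach x_syn
    return outer_loss.item()
```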
Remember the Past: Distilling Datasets into Addressable Memories for Neural Networks
We propose an algorithm that compresses the critical information of a large dataset into compact addressable memories.
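One way to read "addressable memories" is that distilled examples are not stored as raw pixels but materialized on demand as linear combinations of a shared, learnable memory bank addressed by small per-class matrices. The sketch below shows that factorization; sizes and initialization are assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class AddressableMemory(nn.Module):
    """Synthetic examples = per-class addressing matrices applied to shared memory bases."""
    def __init__(self, num_bases=16, dim=3 * 32 * 32, num_classes=10, per_class=1):
        super().__init__()
        self.bases = nn.Parameter(torch.randn(num_bases, dim) * 0.01)       # shared memory
        self.addresses = nn.Parameter(torch.randn(num_classes, per_class, num_bases) * 0.01)

    def forward(self):
        # (num_classes, per_class, dim): the materialized distilled examples
        return torch.einsum('cpb,bd->cpd', self.addresses, self.bases)
```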
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
The resulting algorithm sets new SOTA on ImageNet-1K: we can scale up to 50 IPCs (images per class) on ImageNet-1K on a single GPU (all previous methods can only scale to 2 IPCs on ImageNet-1K), leading to the best accuracy (only a 5.9% accuracy drop against full-dataset training) while utilizing only 4.2% of the number of data points, an 18.2% absolute gain over prior SOTA.
Dataset Distillation with Convexified Implicit Gradients
We propose a new dataset distillation algorithm using reparameterization and convexification of implicit gradients (RCIG), that substantially improves the state-of-the-art.
Embarrassingly Simple Dataset Distillation
Re-examining the foundational back-propagation through time method, we study the pronounced variance in the gradients, computational burden, and long-term dependencies.