Image Dataset Compression Based on Matrix Product States
Large-scale datasets have driven impressive advances in machine learning. However, storing these datasets and training neural network models on them has become increasingly expensive. In this paper, we present an effective dataset compression approach based on matrix product states (MPS) from quantum many-body physics. An MPS decomposes an original image into a sequential product of tensors that effectively retains the short-range correlation information in the data needed to train deep neural networks from scratch. Based on the MPS structure, we propose a new dataset compression method that compresses datasets by filtering out long-range correlation information in task-agnostic scenarios and uses dataset distillation to supplement the information in task-specific scenarios. Our approach boosts model performance through information supplementation while maximizing the information useful for the downstream task. Extensive experiments demonstrate the effectiveness of the proposed approach in dataset compression; in particular, it achieves better model performance (3.19$\%$ on average) than state-of-the-art methods at the same compression rate.
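To make the "sequential product of tensors" concrete, the following is a minimal sketch of a generic MPS (tensor-train) factorization computed with sequential truncated SVDs. It is not the paper's implementation: the function names `mps_decompose` and `mps_reconstruct`, the 16x16 random stand-in image, the reshape into a (4, 4, 4, 4) tensor, and the `max_bond` values are all assumptions chosen for illustration. How pixels are mapped to tensor indices is likewise an assumption and may differ from the ordering used in the paper.

```python
import numpy as np


def mps_decompose(tensor, max_bond):
    """Factor an N-way tensor into a chain of 3-way cores (hypothetical helper).

    Each core has shape (left_bond, physical_dim, right_bond). The bond
    dimension is capped at max_bond by discarding the smallest singular values.
    """
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_bond, len(s))
        # The truncated left factor becomes the k-th core.
        cores.append(u[:, :new_rank].reshape(rank, dims[k], new_rank))
        # Carry the remaining weight (singular values times right factor) forward.
        mat = s[:new_rank, None] * vt[:new_rank]
        rank = new_rank
        mat = mat.reshape(rank * dims[k + 1], -1)
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores


def mps_reconstruct(cores):
    """Contract the chain of cores back into a dense tensor (hypothetical helper)."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    # Drop the dummy bonds of size 1 at both ends of the chain.
    return out.squeeze(axis=(0, -1))


# Assumed toy input: a random 16x16 "image" viewed as a 4-way tensor.
rng = np.random.default_rng(0)
image = rng.random((16, 16))
tensor = image.reshape(4, 4, 4, 4)

# With a bond cap at least as large as the maximal rank, nothing is discarded.
exact = mps_reconstruct(mps_decompose(tensor, max_bond=16))

# A smaller cap stores fewer numbers at the cost of an approximation error.
cores = mps_decompose(tensor, max_bond=4)
approx = mps_reconstruct(cores)
stored = sum(core.size for core in cores)
rel_error = np.linalg.norm(approx - tensor) / np.linalg.norm(tensor)
```

In this sketch the bond dimension is the compression knob: it bounds how much correlation between the two halves of the chain each cut can carry, so lowering `max_bond` shrinks the number of stored values (`stored`) while increasing `rel_error`. Random data is close to a worst case for such truncation; natural images are generally far more compressible.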