Image Dataset Compression Based on Matrix Product States

29 Sep 2021  ·  Ze-Feng Gao, Peiyu Liu, Xiao-Hui Zhang, Xin Zhao, Z. Y. Xie, Zhong-Yi Lu, Ji-Rong Wen

Large-scale datasets have driven impressive advances in machine learning. However, storing such datasets and training neural network models on them have become increasingly expensive. In this paper, we present an effective dataset compression approach based on matrix product states (MPS) from quantum many-body physics. The approach decomposes each original image into a sequential product of tensors that effectively retains the short-range correlation information in the data for training deep neural networks from scratch. Based on the MPS structure, we propose a new dataset compression method that compresses datasets by filtering out long-range correlation information in task-agnostic scenarios and uses dataset distillation to supplement the information in task-specific scenarios. Our approach boosts model performance through this information supplementation while maximizing the information useful for the downstream task. Extensive experiments demonstrate the effectiveness of the proposed approach in dataset compression; in particular, it achieves better model performance (by 3.19% on average) than state-of-the-art methods at the same compression rate.
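
To make the idea of writing an image as a sequential product of tensors concrete, the following is a minimal sketch of a generic MPS (tensor-train) factorization by repeated truncated SVD. It is not the authors' implementation: the function names `mps_decompose` and `mps_reconstruct`, the factorization `dims` of the pixel count, and the bond cap `max_bond` are all assumptions made for illustration. Capping the bond dimension is the standard MPS way of discarding weaker correlations; how closely this matches the filtering step in the paper is likewise an assumption.

```python
import numpy as np

def mps_decompose(image, dims, max_bond):
    """Factor an image into a chain of 3-index tensors (illustrative only)."""
    # dims is a hypothetical factorization of the pixel count, e.g. 784 = 4*7*7*4.
    assert int(np.prod(dims)) == image.size
    cores = []
    rest = np.asarray(image, dtype=float).reshape(1, -1)
    left = 1  # size of the bond shared with the previous core
    for d in dims[:-1]:
        # Group (left bond, current physical index) as rows, everything else as columns.
        rest = rest.reshape(left * d, -1)
        u, s, vt = np.linalg.svd(rest, full_matrices=False)
        keep = min(max_bond, len(s))  # truncation: drop the smallest singular values
        cores.append(u[:, :keep].reshape(left, d, keep))
        rest = s[:keep, None] * vt[:keep]  # carry the remainder to the next site
        left = keep
    cores.append(rest.reshape(left, dims[-1], 1))
    return cores

def mps_reconstruct(cores, shape):
    """Contract the chain back into an array of the given shape."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.reshape(shape)

# Example usage on a random 28x28 array standing in for an image.
rng = np.random.default_rng(0)
img = rng.random((28, 28))
cores = mps_decompose(img, dims=(4, 7, 7, 4), max_bond=2)
approx = mps_reconstruct(cores, img.shape)
stored = sum(core.size for core in cores)  # number of values kept in the chain
```

With a small `max_bond` the chain stores far fewer values than the original array, at the cost of an approximate reconstruction; with a sufficiently large `max_bond` no singular values are dropped and the reconstruction is exact up to floating-point error.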
