From Data Quality to Model Quality: an Exploratory Study on Deep Learning

10 Jun 2019  ·  Tianxing He, Shengcheng Yu, Ziyuan Wang, Jieqiong Li, Zhenyu Chen

Nowadays, people strive to improve the accuracy of deep learning models. However, very little work has focused on the quality of data sets. In fact, data quality determines model quality. Therefore, it is important to study how data quality affects model quality. In this paper, we consider four aspects of data quality: Dataset Equilibrium, Dataset Size, Quality of Label, and Dataset Contamination. We design experiments on MNIST and CIFAR-10 to find out how these four aspects influence model quality. Experimental results show that all four aspects have a decisive impact on model quality: a decrease in data quality in any of these aspects reduces the accuracy of the model.
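
The abstract names the four aspects but does not say how each one is degraded in the experiments. The following is a minimal sketch of one plausible way to perturb a labeled dataset along each aspect, assuming NumPy arrays of inputs x and integer labels y. The function names, the choice of Gaussian noise for contamination, and the uniform label flipping are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def subsample(x, y, fraction, rng):
    """Dataset Size: keep a uniformly random fraction of the samples."""
    n_keep = int(round(len(y) * fraction))
    idx = rng.choice(len(y), size=n_keep, replace=False)
    return x[idx], y[idx]


def imbalance(x, y, minority_class, keep_fraction, rng):
    """Dataset Equilibrium: keep only part of one class, all of the others."""
    cls_idx = np.flatnonzero(y == minority_class)
    n_keep = int(round(len(cls_idx) * keep_fraction))
    kept_cls = rng.choice(cls_idx, size=n_keep, replace=False)
    others = np.flatnonzero(y != minority_class)
    keep = np.sort(np.concatenate([others, kept_cls]))
    return x[keep], y[keep]


def flip_labels(y, noise_rate, num_classes, rng):
    """Quality of Label: move a fraction of labels to a different class."""
    y_noisy = y.copy()
    n_flip = int(round(len(y) * noise_rate))
    idx = rng.choice(len(y), size=n_flip, replace=False)
    # An offset in [1, num_classes - 1] modulo num_classes never maps a
    # label back to itself, so exactly n_flip labels change.
    offsets = rng.integers(1, num_classes, size=n_flip)
    y_noisy[idx] = (y_noisy[idx] + offsets) % num_classes
    return y_noisy


def contaminate(x, fraction, sigma, rng):
    """Dataset Contamination (assumed form): add Gaussian noise to a
    fraction of the inputs. Values may leave the original pixel range."""
    x_noisy = x.astype(np.float32).copy()
    n_noisy = int(round(len(x) * fraction))
    idx = rng.choice(len(x), size=n_noisy, replace=False)
    noise = rng.normal(0.0, sigma, size=x_noisy[idx].shape)
    x_noisy[idx] = x_noisy[idx] + noise
    return x_noisy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for an image dataset: 1000 samples, 10 classes.
    x = rng.random((1000, 28, 28), dtype=np.float32)
    y = rng.integers(0, 10, size=1000)

    x_small, y_small = subsample(x, y, fraction=0.5, rng=rng)
    x_imb, y_imb = imbalance(x, y, minority_class=3, keep_fraction=0.1, rng=rng)
    y_noisy = flip_labels(y, noise_rate=0.2, num_classes=10, rng=rng)
    x_dirty = contaminate(x, fraction=0.3, sigma=0.5, rng=rng)
    print(len(y_small), len(y_imb), int((y_noisy != y).sum()), x_dirty.shape)
```

Training the same model on each perturbed copy at several perturbation levels, then comparing test accuracy against the clean baseline, would give one curve per aspect. That is the kind of comparison the abstract describes.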
