Data Labeling Impact on Deep Learning Models in Digital Pathology: a Breast Cancer Case Study

Image data labeling is a vital step in training deep learning models. Studies on data labeling have focused on problems such as the curse of big data labeling or labeling tools, and have not considered its impact on model performance. It seems clear that labeling errors have a significant impact and should be fixed. However, in the medical domain, proper data labeling is hard to ensure. In general, trained engineers are asked to annotate histology images, which introduces labeling errors. The aim of this study is to highlight the impact of data labeling on deep learning models. For that purpose, deep learning models are trained on two sets of annotations produced with different levels of expertise. Results show the importance of including expertise in deep learning model development. The impact of data labeling is shown through a case study on labeling index scoring for the proliferation biomarker Ki-67.
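
As a rough illustration of why annotation quality matters for this task, the Ki-67 labeling index is commonly defined as the percentage of Ki-67-positive tumor cells among all counted tumor cells. The sketch below is not the paper's pipeline: the function name and the cell counts for the two annotators are hypothetical, chosen only to show how a disagreement in labels shifts the score.

```python
def ki67_labeling_index(positive_cells: int, negative_cells: int) -> float:
    """Return the percentage of Ki-67-positive cells among all counted tumor cells."""
    total = positive_cells + negative_cells
    if total == 0:
        raise ValueError("no cells counted")
    return 100.0 * positive_cells / total


# Hypothetical counts for the same image region from two annotators
# (illustrative values only, not data from the study).
expert = {"positive": 120, "negative": 380}
non_expert = {"positive": 90, "negative": 410}

expert_li = ki67_labeling_index(expert["positive"], expert["negative"])
non_expert_li = ki67_labeling_index(non_expert["positive"], non_expert["negative"])

print(f"expert labeling index:     {expert_li:.1f}%")
print(f"non-expert labeling index: {non_expert_li:.1f}%")
print(f"absolute difference:       {abs(expert_li - non_expert_li):.1f} points")
```

With these made-up counts, relabeling 30 of 500 cells moves the index by several percentage points, which suggests how annotation errors could propagate into a model trained on those labels.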
