Performance Comparison of Balanced and Unbalanced Cancer Datasets using Pre-Trained Convolutional Neural Network

10 Dec 2020  ·  Ali Narin ·

Cancer disease is one of the leading causes of death all over the world. Breast cancer, which is a common cancer disease especially in women, is quite common. The most important tool used for early detection of this cancer type, which requires a long process to establish a definitive diagnosis, is histopathological images taken by biopsy. These obtained images are examined by pathologists and a definitive diagnosis is made. It is quite common to detect this process with the help of a computer. Detection of benign or malignant tumors, especially by using data with different magnification rates, takes place in the literature. In this study, two different balanced and unbalanced study groups have been formed by using the histopathological data in the BreakHis data set. We have examined how the performances of balanced and unbalanced data sets change in detecting tumor type. In conclusion, in the study performed using the InceptionV3 convolution neural network model, 93.55% accuracy, 99.19% recall and 87.10% specificity values have been obtained for balanced data, while 89.75% accuracy, 82.89% recall and 91.51% specificity values have been obtained for unbalanced data. According to the results obtained in two different studies, the balance of the data increases the overall performance as well as the detection performance of both benign and malignant tumors. It can be said that the model trained with the help of data sets created in a balanced way will give pathology specialists higher and accurate results.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods