TXL-PBC: a freely accessible labeled peripheral blood cell dataset

18 Jul 2024 · Lu Gan, Xi Li

In a recent study, we found that the publicly available BCCD and BCD datasets have significant issues, including labeling errors, insufficient sample size, and poor data quality. To address these problems, we deleted samples from, re-labeled, and integrated these two datasets. We also incorporated the PBC and Raabin-WBC datasets, and ultimately created a high-quality, sample-balanced new dataset, which we named TXL-PBC. The dataset contains 1008 training samples, 288 validation samples, and 144 test samples. First, the dataset underwent strict manual annotation, automatic annotation with the YOLOv8n model, and manual audit steps to ensure the accuracy and consistency of the annotations. Second, we address the blood cell mislabeling problem of the original datasets. The distribution of label bounding-box areas and the number of labels are better than in the BCCD and BCD datasets. Moreover, we trained the YOLOv8n model on each of the three datasets, and its performance on TXL-PBC surpasses its performance on the two original datasets. Finally, we employed the YOLOv5n, YOLOv5s, YOLOv5l, YOLOv8s, and YOLOv8m detection models as baseline models for TXL-PBC. This study not only enhances the quality of the blood cell dataset but also supports researchers in improving models for blood cell target detection. We published our freely accessible TXL-PBC dataset at https://github.com/lugan113/TXL-PBC_Dataset.
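As a quick sanity check on the split sizes quoted in the abstract, the following minimal sketch computes the total sample count and the share of each split. The dictionary keys and the helper name `split_fractions` are illustrative assumptions, not names taken from the dataset repository; only the three counts come from the abstract.

```python
# Split sizes as quoted in the abstract; the key names are illustrative.
SPLITS = {"train": 1008, "val": 288, "test": 144}


def split_fractions(splits):
    """Return the total sample count and each split's share of that total."""
    total = sum(splits.values())
    fractions = {name: count / total for name, count in splits.items()}
    return total, fractions


total, fractions = split_fractions(SPLITS)
print(f"total samples: {total}")
for name, frac in fractions.items():
    print(f"{name}: {frac:.0%}")
```

Under these counts the splits sum to 1440 samples, which corresponds to a 70/20/10 train/validation/test ratio.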

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| 2D Object Detection | TXL-PBC: a freely accessible labeled peripheral blood cell dataset | yolov5n | mAP50 | 0.958 | #1 |
| 2D Object Detection | TXL-PBC: a freely accessible labeled peripheral blood cell dataset | yolov5s | mAP50 | 0.97 | #2 |
| 2D Object Detection | TXL-PBC: a freely accessible labeled peripheral blood cell dataset | yolov8m | mAP50 | 0.974 | #4 |
| 2D Object Detection | TXL-PBC: a freely accessible labeled peripheral blood cell dataset | yolov8s | mAP50 | 0.977 | #5 |
| 2D Object Detection | TXL-PBC: a freely accessible labeled peripheral blood cell dataset | yolov8n | mAP50 | 0.97 | #2 |
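The mAP50 metric reported above is conventionally the mean average precision computed with an intersection-over-union (IoU) threshold of 0.5 for matching a predicted box to a ground-truth box. The page does not spell out this definition, so the sketch below illustrates the standard convention rather than the authors' evaluation code. The function names and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, gt_box, threshold=0.5):
    """True if a prediction overlaps a ground-truth box enough to count at IoU 0.5."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical boxes, not taken from the dataset.
ground_truth = (10, 10, 50, 50)
close_prediction = (12, 12, 52, 52)
loose_prediction = (20, 20, 60, 60)

print(is_match_at_50(close_prediction, ground_truth))
print(is_match_at_50(loose_prediction, ground_truth))
```

A full mAP50 computation additionally ranks predictions by confidence, builds a precision-recall curve per class, and averages the resulting per-class average precision; the sketch covers only the matching criterion.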
