Self-supervised learning-based cervical cytology for the triage of HPV-positive women in resource-limited settings and low-data regime

Screening Papanicolaou test samples has proven to be highly effective in reducing cervical cancer-related mortality. However, the lack of trained cytopathologists hinders its widespread implementation in low-resource settings. Deep learning-based telecytology diagnosis emerges as an appealing alternative, but it requires the collection of large annotated training datasets, which is costly and time-consuming. In this paper, we demonstrate that the abundance of unlabeled images that can be extracted from Pap smear test whole slide images presents a fertile ground for self-supervised learning methods, yielding performance improvements relative to readily available pre-trained models for various downstream tasks. In particular, we propose \textbf{C}ervical \textbf{C}ell \textbf{C}opy-\textbf{P}asting ($\texttt{C}^{3}\texttt{P}$) as an effective augmentation method, which enables knowledge transfer from open-source and labeled single-cell datasets to unlabeled tiles. Not only does $\texttt{C}^{3}\texttt{P}$ outperform naive transfer from single-cell images, but we also demonstrate its advantageous integration into multiple instance learning methods. Importantly, all our experiments are conducted on our newly introduced \textit{in-house} dataset comprising liquid-based cytology Pap smear images obtained using low-cost technologies. This aligns with our objective of leveraging deep learning-based telecytology for diagnosis in low-resource settings.
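
The abstract does not specify how the copy-pasting is carried out, so the following is only a minimal sketch of a generic copy-paste augmentation under stated assumptions, not the authors' implementation. It assumes that each labeled single-cell image comes with a boolean foreground mask, that cells are pasted at uniformly random positions, and that the labels of the pasted cells are kept as weak labels for the tile. The function names `paste_cell` and `copy_paste_augment` and the parameter `n_paste` are hypothetical.

```python
import numpy as np


def paste_cell(tile, cell, mask, top, left):
    """Return a copy of `tile` with the masked pixels of `cell` pasted at (top, left).

    tile: (H, W, 3) array, cell: (h, w, 3) array, mask: (h, w) boolean array.
    """
    out = tile.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = cell[mask]                 # overwrite foreground pixels only
    return out


def copy_paste_augment(tile, cells, masks, labels, rng, n_paste=3):
    """Paste `n_paste` randomly chosen labeled cells onto an unlabeled tile.

    Assumption: positions are sampled uniformly and overlaps are allowed.
    Returns the augmented tile and the labels of the pasted cells.
    """
    out = tile
    pasted_labels = []
    tile_h, tile_w = tile.shape[:2]
    for _ in range(n_paste):
        i = int(rng.integers(len(cells)))
        h, w = masks[i].shape
        top = int(rng.integers(0, tile_h - h + 1))
        left = int(rng.integers(0, tile_w - w + 1))
        out = paste_cell(out, cells[i], masks[i], top, left)
        pasted_labels.append(labels[i])
    return out, pasted_labels


# Toy usage with synthetic arrays standing in for real images.
rng = np.random.default_rng(0)
tile = np.zeros((64, 64, 3), dtype=np.uint8)      # stand-in for an unlabeled tile
cell = np.full((8, 8, 3), 255, dtype=np.uint8)    # stand-in for a labeled cell crop
mask = np.ones((8, 8), dtype=bool)                # assumed foreground mask
augmented, pasted = copy_paste_augment(
    tile, [cell], [mask], ["abnormal"], rng, n_paste=1
)
```

In this sketch the original tile is left untouched, so the same unlabeled tile can be reused to produce several augmented views with different pasted cells.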
