Whole slide pathological image classification of breast cancer based on mixed supervision learning
Objective: Breast cancer is among the most common malignant tumors in women, and its early diagnosis and accurate classification are of great importance. Breast cancer whole slide pathological images serve as an important auxiliary diagnostic tool, and their classification can assist doctors in accurately identifying tumor types. However, given the complexity and huge data volume of breast cancer whole slide pathological images, manually annotating the label of each image is time consuming and labor intensive. Researchers have therefore proposed various automated methods for classifying breast cancer whole slide pathological images. Self-supervised and weakly supervised learning can effectively tackle this challenge. Self-supervised learning is a type of machine learning that requires no manual annotation of labels: it designs tasks that enable the model to learn feature representations from unlabeled data. Self-supervised learning has achieved remarkable progress in computer vision, but it still faces challenges in breast cancer whole slide pathological image classification. Given the complexity and diversity of pathological images, the pseudo labels generated by self-supervised learning alone may fail to reflect the true classification information, which degrades classification performance. Weakly supervised learning, on the other hand, leverages information from unlabeled image data through methods such as multiple instance learning or label propagation. However, the associated models must cope with limited label information and label noise, which affect the model's stability during learning and thus the stability of the prediction results. To overcome the limitations of self-supervised and weakly supervised learning, this paper proposes a mixed supervised learning method for breast cancer whole slide
pathological image classification. The integration of MoBY self-supervised contrastive learning with weakly supervised multiple instance learning combines the advantages of both learning paradigms and makes full use of unlabeled and noisily labeled data. In addition, the combination improves classification performance through feature selection and spatial correlation enhancement, which increases robustness.

Method: First, the self-supervised MoBY method was used to train the model on unlabeled pathological image data. MoBY, a self-supervised learning method based on self-reconstruction and contrastive learning, can learn key feature representations from images. This process enables the model to extract useful feature information from unlabeled data and provides better initialization parameters for the subsequent classification task. Then, a weakly supervised approach based on multiple instance learning was used to further optimize the model. Multiple instance learning utilizes information from unlabeled image data for model training. In breast cancer whole slide pathological image classification, accurately annotating the category of each image is often challenging. This type of learning divides images into positive and negative instances based on instance-level labels to train the model. The approach helps mitigate the problem of limited label information and improves the model's robustness and generalization capability. In the feature selection stage, representative feature vectors were selected from each whole slide image to reduce redundancy and noise, extract the most informative features, and sharpen the model's focus on, and discrimination of, key regions. In addition, a Transformer encoder is used to strengthen the correlation among image patches. The Transformer encoder is a powerful tool for modeling global contextual information in images, and it captures semantic relationships between different regions of an
image to further increase the classification accuracy. Introducing the Transformer encoder into breast cancer whole slide pathological image classification enables better use of global image information and gives the model a better understanding of image structure and context. By combining self-supervised and weakly supervised learning, the proposed mixed supervised learning approach achieves high accuracy and robustness in classifying breast cancer whole slide pathological images. In experiments, the method achieved excellent classification results on a dataset of breast cancer whole slide pathological images. The approach offers a powerful tool and technical support for the early diagnosis and accurate classification of breast cancer.

Result: The effectiveness of the mixed supervised model was validated through evaluation experiments on the publicly available Camelyon-16 breast cancer pathological image dataset. Compared with the state-of-the-art weakly supervised and self-supervised models on this dataset, the proposed model improved the area under the receiver operating characteristic curve (AUC) by 2.34% and 2.74%, respectively. This finding indicates that the proposed method outperformed the other models in breast cancer whole slide pathological image classification. To further validate its generalization capability, we performed experiments on an external validation dataset, MSK. On this validation dataset, the proposed model achieved a performance improvement of 6.26%, which further confirms its strong generalization capability and practicality.

Conclusion: The proposed breast cancer whole slide pathological image classification method based on mixed supervision achieved remarkable results in addressing this challenge. By leveraging the advantages of self-supervised learning, weakly supervised learning, and spatial correlation enhancement, the given model
demonstrated improved classification performance on both the public and the external validation datasets. The method exhibits good generalization capability and offers a viable solution for the early diagnosis and treatment of breast cancer. Future research should further refine and optimize the proposed method to increase its accuracy and robustness in breast cancer whole slide pathological image classification. This work helps address the challenges in breast cancer pathological image classification and contributes to the development of early breast cancer diagnosis and treatment.
breast cancer whole slide pathology image; classification; mixed supervised learning; feature fusion; Transformer
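
As a rough illustration of the multiple instance setting and the AUC metric mentioned in the abstract, the following minimal Python sketch scores each slide from its patches and then evaluates the slide-level scores. It is not the authors' implementation. The patch scores are made-up toy values, mean pooling over the top-k patches is a simple stand-in assumed here in place of the paper's feature selection and Transformer-based aggregation, and all function names (`select_top_k`, `bag_score`, `roc_auc`) are hypothetical.

```python
from typing import List, Sequence


def select_top_k(patch_scores: Sequence[float], k: int) -> List[int]:
    """Return indices of the k highest-scoring patches of one slide.

    Assumption: a scalar score per patch stands in for the paper's
    selection of representative feature vectors.
    """
    if not patch_scores or k < 1:
        raise ValueError("need at least one patch and k >= 1")
    order = sorted(range(len(patch_scores)),
                   key=lambda i: patch_scores[i], reverse=True)
    return order[:k]


def bag_score(patch_scores: Sequence[float], k: int) -> float:
    """Slide-level (bag-level) score: mean of the top-k patch scores."""
    top = select_top_k(patch_scores, k)
    return sum(patch_scores[i] for i in top) / len(top)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): patch scores and a slide-level label per slide.
slides = {
    "slide_a": ([0.1, 0.9, 0.8, 0.2], 1),
    "slide_b": ([0.2, 0.1, 0.3, 0.1], 0),
    "slide_c": ([0.4, 0.7, 0.1, 0.6], 1),
    "slide_d": ([0.5, 0.9, 0.2, 0.1], 0),
}

labels = [label for _, label in slides.values()]
scores = [bag_score(patches, k=2) for patches, _ in slides.values()]
auc = roc_auc(labels, scores)
print(f"slide-level AUC on toy data: {auc:.2f}")
```

In this toy example, three of the four positive/negative slide pairs are ranked correctly, so the pairwise AUC is 0.75. In a real pipeline the patch scores would come from a trained encoder and classifier rather than being fixed by hand.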