Self-supervised extraction of spectral sequence and semantic information for microscopic cholangiocarcinoma hyperspectral image classification
Objective Cholangiocarcinoma is a cancer with a high fatality rate, and early detection and treatment can significantly reduce its incidence. Digital diagnosis of pathological sections can effectively improve the accuracy and efficiency of cancer diagnosis. Microscopic hyperspectral images of pathological sections contain richer spectral information than color images. Because biological tissues have a specific spectral response, pathological tissues differ from normal tissues in their spectral characteristics, and this rich spectral information offers great potential for distinguishing cancer cells from healthy cells. The performance of most pathological hyperspectral image classification algorithms depends heavily on high-quality labeled datasets, but pathological hyperspectral images must be labeled manually by experienced pathologists, which is time consuming and laborious. Self-supervised feature extraction algorithms first extract features from unlabeled image data in an unsupervised way by designing pretext tasks and then transfer the learned features to downstream tasks. After the downstream task network is fine-tuned with a limited number of labeled samples, these algorithms can achieve performance comparable to supervised learning and alleviate the data annotation problem. However, traditional contrastive self-supervised learning is limited in extracting high-level semantic information, and an image enhancement (data augmentation) method specific to pathological hyperspectral images is not yet available. Therefore, this paper proposes a self-supervised method that extracts spectral sequence and semantic information from hyperspectral images of cholangiocarcinoma, improving the feature extraction capability and classification accuracy of self-supervised learning. Method Hyperspectral images differ from natural images in that image enhancement techniques, such as color transformation, can change spectral information. It is
therefore meaningful to use an encoder structure as an image enhancement method for hyperspectral images. However, the encoder used in existing methods is based on the convolutional neural network (CNN), and the feature map extracted by a CNN corresponds to a local receptive field and ignores the global information of the spectral dimension. Given the limited ability of CNNs to characterize spectral sequence data, this paper first designs a Transformer encoder structure for image enhancement, which retains the details of the sequence in the original image. Borrowing from natural language processing, the Transformer architecture, which can model sequential information, treats the spectral curve of each pixel of the hyperspectral image as a spectral sequence. The Transformer then uses position embedding and an attention module to attend to the differences among spectral sequences and to learn spectral sequence information efficiently. Second, after the image is enhanced with the Transformer encoder structure to obtain positive samples, a convolutional autoencoder is used as a second image enhancement to obtain the negative samples required for contrastive learning. Traditional contrastive learning extracts features from low-level image differences, which limits its ability to extract high-level semantic information. To address this problem, this paper applies prototypical contrastive learning to extract features from pathological hyperspectral images. Positive and negative samples are trained through the clustering and instance discrimination tasks of a prototypical contrastive learning network to learn high-level semantic information in the images. The feature extraction process described above uses only unlabeled data. Finally, the classification results are obtained by fine-tuning the downstream classification task network with a few labeled samples. Result Experiments were conducted on eight scenes in the hyperspectral dataset of
multidimensional cholangiocarcinoma pathology. These scenes were selected from eight patients. To ensure that the scenes are representative, they differ in cancer cell morphology, cancer cell proportion, and spectral response curve. The cancer regions in scenes 2, 3, and 8 account for only 1/8 of the whole image. Experiments were conducted on each scene, where 5% of the data was labeled for training and 95% was used for testing. To verify the effectiveness of the proposed self-supervised method on pathological hyperspectral datasets, it was compared with 12 widely used algorithms and networks, including 7 supervised feature extraction methods and 5 unsupervised feature extraction algorithms. Experimental results show that the features extracted by the proposed method achieve the best results in downstream classification tasks, with an average overall accuracy of 96.63%, an average accuracy of 95.37%, and an average Kappa coefficient of 0.91. Ablation experiments were also conducted. They verify that, compared with the convolutional module, the Transformer module with its self-attention and multi-head attention mechanisms pays more attention to sequence details when extracting features, which effectively retains original image information and achieves high classification accuracy. The prototypical contrastive learning module adds a clustering process on top of contrastive learning and achieves high classification accuracy, which indicates that it can effectively extract high-level semantic information from microscopic hyperspectral images of cholangiocarcinoma. Results of the dimensionality reduction experiment also show that the semantic features extracted by the proposed method are linearly separable. Conclusion The proposed method can extract effective features from unlabeled hyperspectral images of cholangiocarcinoma, and these features can be applied to
classification tasks to achieve high classification accuracy and alleviate the problem of pathological hyperspectral image data labeling. This method carries certain research value and practical significance for the medical diagnosis of cholangiocarcinoma.
cancer classification; hyperspectral images; deep learning; self-supervised learning; image enhancement
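The three figures reported in the Result part (overall accuracy, average accuracy, and Kappa coefficient) can be derived from a confusion matrix over the test pixels. The following is a minimal sketch, assuming the abstract follows the usual hyperspectral classification conventions: overall accuracy is the fraction of correctly classified pixels, average accuracy is the mean of the per-class accuracies, and Kappa is Cohen's kappa. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(confusion):
    """Return (overall accuracy, average accuracy, Cohen's kappa).

    confusion[i][j] counts test pixels of true class i predicted as class j.
    Assumes every class has at least one test pixel.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy: share of all test pixels that were classified correctly.
    overall_accuracy = correct / total

    # Average accuracy: mean of per-class accuracies, so a small cancer
    # region weighs as much as a large non-cancer region.
    per_class = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    average_accuracy = sum(per_class) / n_classes

    # Cohen's kappa: agreement corrected for the agreement expected by chance,
    # estimated from the row (true) and column (predicted) totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (overall_accuracy - expected) / (1 - expected)

    return overall_accuracy, average_accuracy, kappa


# Hypothetical two-class counts (row 0: non-cancer, row 1: cancer).
example_confusion = [[850, 30],
                     [20, 100]]
oa, aa, kappa = classification_metrics(example_confusion)
print(f"OA={oa:.4f}  AA={aa:.4f}  Kappa={kappa:.4f}")
```

When the cancer region covers only a small part of a scene, as in scenes 2, 3, and 8, overall accuracy can stay high even if many cancer pixels are missed, which is why average accuracy and Kappa are usually reported alongside it.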