TransAS-UNet: regional segmentation of breast cancer combining the Swin Transformer and the UNet algorithm
Objective: Breast cancer is a serious, high-morbidity disease in women, and its early detection is an important problem that needs to be solved worldwide. Current diagnostic methods for breast cancer include clinical, imaging, and histopathological examinations. Commonly used imaging modalities include X-ray mammography, computed tomography (CT), and magnetic resonance imaging (MRI); among them, mammography has long been used for early cancer detection. However, manually segmenting the mass from a local mammogram is a very time-consuming and error-prone task. Therefore, an integrated computer-aided diagnosis (CAD) system is needed to help radiologists perform automatic and precise breast mass identification.

Method: In this work, we compared different image segmentation models built on deep learning segmentation frameworks. Starting from the UNet structure, we adopt the Swin architecture to replace the downsampling and upsampling processes of the segmentation task, realizing the interaction between local and global features. We also use a Transformer to obtain more global information and different hierarchical features in place of the short (skip) connections, realizing multi-scale feature fusion for accurate segmentation. After the segmentation stage, a Multi-Attention ResNet classification network identifies the category of the cancer region for better diagnosis and treatment of breast cancer. During segmentation, the Swin Transformer and atrous spatial pyramid pooling (ASPP) modules replace the common convolution layers, by analogy with the UNet structure. The shifted window and multi-head self-attention integrate the feature information inside each image patch and extract complementary information between non-adjacent areas, while the ASPP structure provides self-attention over local information with an increasing receptive field. A Transformer structure is introduced to correlate information between different layers and to prevent the loss of important shallow-layer information during downsampling convolution. The final architecture therefore inherits the Transformer's advantage in learning global semantic associations and also uses features at different levels to preserve more semantics and more details in the model. The binarized images produced by the segmentation model serve as the input dataset of the classification network, which identifies the different categories of breast cancer tumors. Based on ResNet50, the classification model adds multiple attention modules and anti-overfitting operations: squeeze-and-excitation (SE) and selective kernel (SK) attention optimize the network parameters so that the network focuses only on the differences between segmented regions, improving its efficiency. The proposed model achieved accurate segmentation of masses on the INbreast breast cancer X-ray dataset, and we compared it with five segmentation structures: UNet, UNet++, Res18_UNet, MultiRes_UNet, and Dense_UNet. After the segmentation model, a more accurate binary map of the cancer region was obtained.

In the upsampling and downsampling of the UNet structure, problems remain in blending feature information from different levels and in the purely local self-attention of the convolutional layers. Therefore, the Swin Transformer structure, with its sliding-window operation and hierarchical design, is adopted. It consists mainly of a Window Attention module and a Shifted Window Attention module, which slice the input feature map into multiple windows. The attention weights of each window are computed under the shifted self-attention scheme, in which the position of the entire feature map is cyclically shifted, realizing information interaction within the same feature map.
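To make the window slicing and shifting concrete, the following is a minimal PyTorch-style sketch of the two operations described above; the (B, H, W, C) tensor layout, the 7×7 window size, and the helper names are illustrative assumptions, not the exact TransAS-UNet implementation.

import torch

def window_partition(x, window_size):
    # Slice a (B, H, W, C) feature map into non-overlapping
    # window_size x window_size windows for local self-attention.
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)

def shift_then_partition(x, window_size):
    # Cyclically roll the feature map by half a window before partitioning,
    # so that the next attention layer mixes information across the borders
    # of the previous, unshifted windows.
    shift = window_size // 2
    shifted = torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))
    return window_partition(shifted, window_size)

# Example: a 56 x 56 feature map with 96 channels and 7 x 7 windows.
feat = torch.randn(1, 56, 56, 96)
regular_windows = window_partition(feat, 7)       # (64, 7, 7, 96)
shifted_windows = shift_then_partition(feat, 7)   # same shape, shifted content

Self-attention is then computed independently inside each window, and alternating regular and shifted windows lets information propagate across the whole feature map.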
In the upsampling and downsampling paths, we use four Swin Transformer structures, and in the fusion process we use the pyramid ASPP structure to replace the common channel-wise addition of feature maps. ASPP applies multiple convolution kernels to the feature maps and fuses their channels; the given input is sampled in parallel by atrous (dilated) convolutions at different sampling rates, so that image context information is captured at multiple scales. To better integrate high- and low-dimensional spatial information, we propose a new multi-scale feature-map fusion strategy and use a Transformer with skip connections to enhance the representation of spatial-domain information. Following the description of the INbreast dataset, each cancer image was classified as normal, mass, deformation, or calcification; each category was labeled and then fed to the classification network. The classification model takes ResNet50 as its baseline, to which two kinds of attention, i.e., SE and SK, are added. SK convolution replaces the 3×3 convolution in every bottleneck so that more image features can be extracted at the convolutional layer, while SE is a channel attention mechanism that weights each channel before the pixel values are output. Three techniques, namely, Gaussian error gradient descent, label smoothing, and partial data augmentation, are introduced to further improve the accuracy of the model.

Result: Under the same parameter settings, the intersection over union (IoU) reached 95.58% and the Dice coefficient reached 93.45%, which is 4%-6% higher than those of the other segmentation models. The binary segmentation images were then classified into the four categories with an accuracy of 95.24%.
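For reference, the following is a minimal sketch of how the IoU and Dice coefficient reported above are commonly computed from a predicted binary mask and its ground truth; this is the standard formulation, not necessarily the authors' exact evaluation code.

import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    # pred, target: binary masks (0/1 arrays) of the same shape.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = intersection / (union + eps)
    dice = 2.0 * intersection / (pred.sum() + target.sum() + eps)
    return iou, dice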
Conclusion: Experiments show that the proposed TransAS-UNet image segmentation method achieves good performance and clinical significance and is superior to other 2D medical image segmentation methods.

Keywords: breast cancer; deep learning; medical image segmentation; TransAS-UNet; image classification