TransAS-UNet: regional segmentation of breast cancer combining the Swin Transformer and the UNet algorithm
Objective: Breast cancer is a serious, high-morbidity disease in women, and its early detection is an important problem that needs to be solved worldwide. Current diagnostic methods for breast cancer include clinical, imaging, and histopathological examinations. Commonly used imaging modalities include X-ray mammography, computed tomography (CT), and magnetic resonance imaging (MRI); among them, mammography has long been used for early cancer detection. However, manually segmenting the mass from a local mammogram is a very time-consuming and error-prone task. Therefore, an integrated computer-aided diagnosis (CAD) system is needed to help radiologists perform automatic and precise breast mass identification.

Method: In this work, we compared different image segmentation models built on deep learning segmentation frameworks. Starting from the UNet structure, we adopt the Swin architecture to replace the downsampling and upsampling processes of the segmentation task, realizing the interaction between local and global features. We also use a Transformer to obtain more global information and different hierarchical features in place of the short (skip) connections, realizing multi-scale feature fusion for accurate segmentation. After the segmentation stage, a Multi-Attention ResNet classification network identifies the category of the cancer region for better diagnosis and treatment of breast cancer. During segmentation, the Swin Transformer and atrous spatial pyramid pooling (ASPP) modules replace the common convolution layers, by analogy with the UNet structure. The shifted window and multi-head self-attention integrate the feature information inside each image patch and extract complementary information between non-adjacent areas, while the ASPP structure provides self-attention over local information with an increasing receptive field. A Transformer structure is introduced to correlate information between different layers and to prevent the loss of important shallow-layer information during downsampling convolution. The final architecture therefore inherits the Transformer's advantage in learning global semantic associations and also uses features at different levels to preserve more semantics and more details in the model. The binarized images produced by the segmentation model serve as the input dataset of the classification network, which identifies the different categories of breast cancer tumors. Based on ResNet50, the classification model adds multiple attention modules and anti-overfitting operations: squeeze-and-excitation (SE) and selective kernel (SK) attention optimize the network parameters so that the network focuses only on the differences between segmented regions, improving its efficiency. The proposed model achieved accurate segmentation of masses on the INbreast breast cancer X-ray dataset, and we compared it with five segmentation structures: UNet, UNet++, Res18_UNet, MultiRes_UNet, and Dense_UNet. After the segmentation model, a more accurate binary map of the cancer region was obtained.

In the upsampling and downsampling of the UNet structure, problems remain in blending feature information from different levels and in the purely local self-attention of the convolutional layers. Therefore, the Swin Transformer structure, with its sliding-window operation and hierarchical design, is adopted. It consists mainly of a Window Attention module and a Shifted Window Attention module, which slice the input feature map into multiple windows. The attention weights of each window are computed under the shifted self-attention scheme, in which the position of the entire feature map is cyclically shifted, realizing information interaction within the same feature map.
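To make the window slicing and shifting concrete, the following is a minimal PyTorch-style sketch of the two operations described above; the (B, H, W, C) tensor layout, the 7×7 window size, and the helper names are illustrative assumptions, not the exact TransAS-UNet implementation.

import torch

def window_partition(x, window_size):
    # Slice a (B, H, W, C) feature map into non-overlapping
    # window_size x window_size windows for local self-attention.
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size, window_size, C)

def shift_then_partition(x, window_size):
    # Cyclically roll the feature map by half a window before partitioning,
    # so that the next attention layer mixes information across the borders
    # of the previous, unshifted windows.
    shift = window_size // 2
    shifted = torch.roll(x, shifts=(-shift, -shift), dims=(1, 2))
    return window_partition(shifted, window_size)

# Example: a 56 x 56 feature map with 96 channels and 7 x 7 windows.
feat = torch.randn(1, 56, 56, 96)
regular_windows = window_partition(feat, 7)       # (64, 7, 7, 96)
shifted_windows = shift_then_partition(feat, 7)   # same shape, shifted content

Self-attention is then computed independently inside each window, and alternating regular and shifted windows lets information propagate across the whole feature map.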
In the upsampling and downsampling paths, we use four Swin Transformer structures, and in the fusion process we use the pyramid ASPP structure to replace the common channel-wise addition of feature maps. ASPP applies multiple convolution kernels to the feature maps and fuses their channels; the given input is sampled in parallel by atrous (dilated) convolutions at different sampling rates, so that image context information is captured at multiple scales. To better integrate high- and low-dimensional spatial information, we propose a new multi-scale feature-map fusion strategy and use a Transformer with skip connections to enhance the representation of spatial-domain information. Following the description of the INbreast dataset, each cancer image was classified as normal, mass, deformation, or calcification; each category was labeled and then fed to the classification network. The classification model takes ResNet50 as its baseline, to which two kinds of attention, i.e., SE and SK, are added. SK convolution replaces the 3×3 convolution in every bottleneck so that more image features can be extracted at the convolutional layer, while SE is a channel attention mechanism that weights each channel before the pixel values are output. Three techniques, namely, Gaussian error gradient descent, label smoothing, and partial data augmentation, are introduced to further improve the accuracy of the model.

Result: Under the same parameter settings, the intersection over union (IoU) reached 95.58% and the Dice coefficient reached 93.45%, which is 4%-6% higher than those of the other segmentation models. The binary segmentation images were then classified into the four categories with an accuracy of 95.24%.
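For reference, the following is a minimal sketch of how the IoU and Dice coefficient reported above are commonly computed from a predicted binary mask and its ground truth; this is the standard formulation, not necessarily the authors' exact evaluation code.

import numpy as np

def iou_and_dice(pred, target, eps=1e-7):
    # pred, target: binary masks (0/1 arrays) of the same shape.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    iou = intersection / (union + eps)
    dice = 2.0 * intersection / (pred.sum() + target.sum() + eps)
    return iou, dice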
Conclusion: Experiments show that the proposed TransAS-UNet image segmentation method achieves good performance and clinical significance and is superior to other 2D medical image segmentation methods.

Keywords: breast cancer; deep learning; medical image segmentation; TransAS-UNet; image classification