Application of U-Net channel transformation network in gland image segmentation
Objective Adenocarcinoma is a malignant tumor originating from the glandular epithelium and poses immense harm to human health. With the rapid development of computer vision technology, medical imaging has become an important means for expert preoperative diagnosis. In the diagnosis of adenocarcinoma, doctors judge the severity of the cancer and grade it by analyzing the size, shape, and other external features of the glandular structure. Accordingly, achieving high-precision segmentation of glandular images has become an urgent requirement in clinical medicine. Glandular medical image segmentation refers to the process of separating the glandular region from the surrounding tissue in medical images, and it requires high segmentation accuracy. Traditional models for segmenting glandular medical images can suffer from problems such as imprecise segmentation and mis-segmentation owing to the diverse shapes of glands and the presence of numerous small targets. To address this issue, this study proposes an improved glandular medical image segmentation algorithm based on UCTransNet. UCTransNet addresses the semantic gap between the different-resolution modules of the encoder and between the encoder and decoder, thereby achieving high-precision image segmentation.

Method First, a fusion of the ASPP_SE and ConvBatchNorm modules is added to the front end of the encoder. The ASPP_SE module combines the ASPP module with a channel attention mechanism. The ASPP module consists of three atrous convolutions with different dilation rates, a 1 × 1 convolution, and ASPP pooling. Atrous convolution injects holes into standard convolution to expand the receptive field and obtain dense data features while keeping the output feature map the same size. The ASPP module uses multi-scale atrous convolution to obtain a large receptive field and fuses the resulting features with the global features obtained from ASPP pooling, yielding denser semantic information than the original features. The channel
attention mechanism enables the model to focus on important channel regions in the image, dynamically select information, and assign large weights to channels containing important information. In the CCT (channel cross fusion with Transformer), modules carrying higher weights of important information achieve better fusion. The ConvBatchNorm module enhances the ability of the encoder to extract the features of small targets while preventing overfitting during model training. Second, a simplified dense connection is embedded between the encoder and the skip connections, and the CCT in the model performs global fusion of the features extracted by the encoder from a channel perspective. Although the global attention ability of the CCT is strong, its local attention ability is weak, and the ambiguity between adjacent encoder modules remains unsolved. To solve this problem, a dense connection is added to enhance the fusion of local information. The dense connection passes the upper encoder module through convolution and pooling to obtain the lower encoder module, then upsamples the lower encoder module so that its resolution matches that of the upper one. The two encoder modules are concatenated along the channel dimension, and the resolution does not change after concatenation. After concatenation, the upper encoder module obtains supplementary feature information from the lower encoder module. Consequently, the semantic fusion between adjacent modules is enhanced, the semantic gap between adjacent encoder modules is reduced, and the fusion of feature information between adjacent encoder modules is improved. A refiner is added to the CCT; it projects the self-attention map to a higher dimension and uses head convolution to enhance the spatial context and local patterns of the attention map. This method effectively combines the advantages of self-attention and convolution to further improve the self-attention mechanism. Lastly, a linear
projection is used to restore the attention map to the initial resolution, thereby enhancing the global fusion of encoder feature information. In summary, the fused ASPP_SE and ConvBatchNorm modules are added to the front end of the UCTransNet encoder to enhance its ability to extract small-target features and to prevent overfitting; a simplified dense connection is embedded between the encoder and the skip connections to enhance the fusion of adjacent module features; and a refinement module is added to the CCT to project the self-attention map to a higher dimension, thereby enhancing the global feature fusion ability of the encoder. The combination of the simplified dense connection and the CCT refinement module improves the performance of the model.

Result The improved algorithm was tested on the publicly available gland data sets MoNuSeg and GlaS. The Dice and intersection over union (IoU) coefficients were the main evaluation metrics. The Dice coefficient is a similarity measure that represents the similarity between two samples, whereas the IoU coefficient measures the accuracy of the positional information of the result. Both metrics are commonly used in medical image segmentation. The test results on the MoNuSeg data set were 80.55% and 67.32%, while those on the GlaS data set were 92.23% and 86.39%. These results represent improvements of 0.88% and 1.06%, and 1.53% and 2.43%, respectively, compared with those of the original UCTransNet. The improved model was also compared with existing popular segmentation networks and was found to generally outperform them.

Conclusion The proposed model is superior to existing segmentation algorithms in medical gland segmentation and can meet the requirements of clinical medical gland image segmentation. The CCT module in the original model was further optimized to fuse global and local feature information, thereby achieving better results.
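The atrous-convolution behavior described in the Method section — expanding the receptive field while keeping the output feature map the same size — can be sketched as below. This is an illustrative NumPy sketch, not the paper's implementation; the function names and the 3 × 3 square-kernel assumption are ours:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation=1):
    """'Same'-size dilated (atrous) cross-correlation: padding equals
    dilation * (k - 1) / 2, so the output keeps the input's spatial size."""
    k = kernel.shape[0]                     # square kernel, odd size
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, pad)
    h, w = x.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            for a in range(k):
                for b in range(k):
                    # sampled taps are `dilation` pixels apart ("holes")
                    out[i, j] += kernel[a, b] * xp[i + a * dilation,
                                                   j + b * dilation]
    return out

def receptive_field(k, dilation):
    """Effective receptive field of one dilated convolution layer."""
    return k + (k - 1) * (dilation - 1)
```

With a 3 × 3 kernel, dilation rate 6 already covers a 13 × 13 region in a single layer, while the output resolution stays unchanged — which is why stacking a few dilation rates in the ASPP branch gives multi-scale context cheaply.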
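The simplified dense connection — upsampling the lower encoder feature to the upper module's resolution and concatenating along the channel dimension — might look like the following sketch; the channel counts and the nearest-neighbour upsampling choice are assumptions for illustration:

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x upsampling of a (C, H, W) feature map,
    bringing the lower encoder module up to the upper one's resolution."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def dense_skip(upper, lower):
    """Concatenate the upper encoder feature with the upsampled lower
    feature along channels; spatial resolution is unchanged."""
    up = upsample2x(lower)
    assert up.shape[1:] == upper.shape[1:], "resolutions must match"
    return np.concatenate([upper, up], axis=0)

# Hypothetical shapes: a 64-channel upper feature at 32x32 and a
# 128-channel lower feature at 16x16.
upper = np.random.rand(64, 32, 32)
lower = np.random.rand(128, 16, 16)
fused = dense_skip(upper, lower)   # shape (192, 32, 32)
```

The concatenation changes only the channel count, so the upper module gains the lower module's semantics without any loss of spatial resolution — exactly the property the abstract relies on for reducing the semantic gap between adjacent encoder modules.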
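The two evaluation metrics reported in the Result section can be computed on binary masks as follows; this is the standard formulation, not code from the paper:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|): overlap-based similarity."""
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def iou_coefficient(pred, target, eps=1e-7):
    """IoU = |A ∩ B| / |A ∪ B|: accuracy of positional overlap."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return (inter + eps) / (union + eps)

# Toy example: a two-pixel prediction against a one-pixel ground truth.
pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [0, 0]])
```

For this toy pair the Dice coefficient is 2/3 and the IoU is 1/2; Dice weights the intersection twice, which is why Dice scores run higher than IoU on the same masks (e.g. 92.23% vs. 86.39% on GlaS above).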
Keywords: medical image segmentation; U-Net from a channel-wise perspective with Transformer (UCTransNet); dense connection; self-attention mechanism; refinement module