Multi-attention and cascaded context fusion for diabetic retinopathy lesion segmentation

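Chinese abstract (translated): Objective: Diabetic retinopathy (DR) is the leading cause of blindness in humans, and automatic, accurate DR lesion segmentation is crucial for DR grading, diagnosis, and treatment. However, the different types of DR lesions have complex structures and inconsistent scales and exhibit inter-class similarity and intra-class variation, which makes accurately segmenting multiple lesion types at once challenging. To address these problems, a multi-type DR lesion segmentation method based on multiple attention and cascaded context fusion is proposed. Method: First, a triple attention module extracts channel attention, spatial attention, and pixel-point attention features of the lesions and fuses them by addition to ensure the consistency of the lesion features. In addition, a cascaded context feature fusion module uses adaptive average pooling and non-local operations to extract global context information from different network layers and thereby enlarge the receptive field for lesions. Finally, a balanced attention module computes foreground, background, and boundary attention maps of the lesions and uses squeeze-and-excitation modules to weight across feature channels, rebalancing the attention over the three regions so that the network attends more to lesion edge details and achieves fine-grained segmentation. Result: Extensive comparison and ablation experiments were conducted on the public international DR image datasets DDR (dataset for diabetic retinopathy), IDRiD (Indian diabetic retinopathy image dataset), and E-Ophtha; the mean AUC (area under curve) over the four lesion types reaches 0.6790, 0.7503, and 0.6601, respectively. Conclusion: The DR segmentation method based on multiple attention and cascaded context fusion (multi-attention and cascaded context fusion network, MCFNet) overcomes adverse interference from other fundus tissues and lesion noise, accurately segments all four DR lesion types simultaneously, shows good accuracy and robustness, and provides strong support for clinicians in DR diagnosis and treatment.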
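The additive fusion of channel, spatial, and pixel-point attention described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the layer design, so the learned convolutions and fully connected layers are replaced here by parameter-free sigmoid gates, and every function name below is hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic function used as a stand-in for learned attention gates."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    # feat has shape (C, H, W). Assumption: one gate per channel, derived
    # from global average pooling instead of a learned layer.
    gate = sigmoid(feat.mean(axis=(1, 2), keepdims=True))  # (C, 1, 1)
    return feat * gate


def spatial_attention(feat):
    # Assumption: one gate per spatial location, derived from the
    # channel-wise mean instead of a learned convolution.
    gate = sigmoid(feat.mean(axis=0, keepdims=True))  # (1, H, W)
    return feat * gate


def pixel_attention(feat):
    # Assumption: one gate per individual element of the feature map.
    gate = sigmoid(feat)  # (C, H, W)
    return feat * gate


def triple_attention(feat):
    # Additive fusion of the three attention branches, as stated in the
    # abstract; all three outputs share the input shape, so they can be summed.
    return channel_attention(feat) + spatial_attention(feat) + pixel_attention(feat)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.standard_normal((8, 16, 16))  # hypothetical encoder output
    fused = triple_attention(features)
    print(fused.shape)
```

Because each branch only rescales the input, the fused output keeps the shape of the encoder feature map, which is what allows the three branches to be combined by simple addition.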

English title: MCFNet: multi-attention and cascaded context fusion network for segmentation multiple lesion of diabetic retinopathy images

English abstract: Objective: Diabetic retinopathy (DR) is a leading cause of blindness in humans, and regular screening helps detect and contain it early. Automated, accurate lesion segmentation is crucial for DR grading and diagnosis, but it is challenging because different kinds of lesions have complex structures, inconsistent scales, and blurry edges. Manual segmentation of DR lesions is time-consuming and labor-intensive, and limited doctor resources and the high cost of manual annotation make it particularly difficult to apply at scale. Therefore, an automatic DR lesion segmentation method should be developed to reduce clinical workload and increase efficiency. Recently, convolutional neural networks have been widely applied in medical image segmentation and disease classification. Existing deep-learning-based methods for DR lesion segmentation are classified into image-based and patch-based approaches. Some studies have adopted the attention mechanism to segment lesions using the whole fundus image as input. However, these methods may lose the edge details of lesions, which makes fine-grained lesion segmentation results hard to obtain. Other studies have cropped the original images into patches and fed them into encoder-decoder networks for DR lesion segmentation. However, most approaches in the literature use fixed weights to fuse encoder features at different levels and ignore the information differences among them, which hinders the effective integration of multi-level features for accurate lesion segmentation. To address these issues, this paper proposes a multi-attention and cascaded context fusion network (MCFNet) for the simultaneous segmentation of multiple lesions.

Method: The proposed network adopts an encoder-decoder framework comprising a VGG16 backbone network, a triple attention module (TAM), a cascaded context fusion module (CFM), and a balanced attention module (BAM). Directly fusing multi-level features from different stages of the encoder easily results in inconsistent feature scales and information redundancy, whereas dynamically selecting important information from multi-resolution feature maps both preserves contextual information in low-resolution feature maps and effectively reduces background noise interference in high-resolution feature maps. TAM is therefore proposed to extract three types of attention features: channel attention, spatial attention, and pixel-point attention. Channel attention assigns different weights to different feature channels to enable the selection of specific feature patterns for lesion segmentation. Spatial attention highlights the location information of lesions in the feature map so that the model attends to lesion areas. Pixel-point attention extracts small-scale lesion features. TAM ensures feature consistency and selectivity by learning and fusing these attention features. In addition, because lesions are small, conventional receptive fields can fail to capture their subtle features. To address this problem, CFM is proposed to capture global context information at different levels and to sum it with the local context information from TAM. The module is designed to expand the scope of multi-scale receptive fields and consequently improve the accuracy and robustness of small-scale lesion segmentation. This study also uses BAM to address rough and inconspicuous lesion edges. This module calculates foreground, background, and boundary attention maps to reduce the adverse interference of background noise and to clarify the lesion contour.

Result: The lesion segmentation performance of the proposed method was compared with that of existing methods on the IDRiD, DDR, and E-Ophtha datasets. Experimental results show that, despite variations in the number and appearance of retinal images from different countries and ethnicities, the proposed model outperforms the state of the art in accuracy and robustness. Specifically, on the IDRiD dataset, MCFNet achieves AUC values of 0.9171, 0.7197, 0.6557, and 0.7087 for segmentation of hard exudates (EX), hemorrhages (HE), microaneurysms (MA), and soft exudates (SE), respectively. The mAUC, mIOU, and mDice over the four kinds of lesions on the IDRiD dataset reach 0.7503, 0.6387, and 0.7003, respectively. On the DDR dataset, the proposed model achieves mAUC, mIOU, and mDice values of 0.6790, 0.4347, and 0.5989 for these lesions. Compared with PSPNet, the proposed method obtains 52.7%, 18.63%, and 33.06% higher mAUC, mIOU, and mDice values, respectively. On the E-Ophtha dataset, the proposed MCFNet achieves mAUC, mIOU, and mDice values of 0.6601, 0.4495, and 0.6285, respectively. Compared with MLSF-Net, these values improve by 15.11%, 4.06%, and 20.68%, respectively. The segmentation results of the proposed model were also compared with those of other methods; they are closer to the ground truth, and the obtained edges are finer and more accurate. To verify the effectiveness of the proposed TAM, CFM, and BAM, comprehensive ablation experiments were conducted on the IDRiD, DDR, and E-Ophtha datasets. Using only the baseline, the model obtained mAUC, mIOU, and mDice values of 0.5975, 0.4512, and 0.5848 on the IDRiD dataset. The fusion of VGG16 with TAM, CFM, and BAM achieved the best segmentation results for all four types of multi-scale lesions, suggesting that each proposed module contributes to multiple-lesion segmentation performance to a varying degree.

Conclusion: This paper proposes a multi-attention and cascaded context fusion network for the multiple-lesion segmentation of diabetic retinopathy images. The proposed MCFNet introduces TAM to learn and fuse channel attention, spatial attention, and pixel-point attention features to ensure feature consistency and selectivity. CFM utilizes adaptive average pooling and non-local operations to capture local and global contextual features for concatenation fusion and to expand the receptive field of fundus lesions. BAM calculates attention maps for the foreground, background, and lesion contours and uses squeeze-and-excitation modules to rebalance the attention features of these regions, preserve the edge details, and reduce interference from background noise. Experimental results on the IDRiD, DDR, and E-Ophtha datasets demonstrate the superiority of the proposed method over the state of the art. The method also effectively overcomes the interference of background and other lesion noise, thus achieving accurate segmentation of different types of multi-scale lesions.
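The mIOU and mDice figures quoted in the abstract are per-lesion overlap scores averaged over the four lesion types. The sketch below shows one common way to compute such scores from binary masks. It is an illustration under assumptions, not the paper's evaluation code: the function names, the smoothing constant `eps`, and the toy masks are hypothetical. The last lines average the four per-lesion AUC values reported for IDRiD and recover the stated mAUC.

```python
import numpy as np


def dice_and_iou(pred, target, eps=1e-7):
    """Dice coefficient and IoU for two binary masks of equal shape.

    eps is a small smoothing constant (an assumption, not from the paper)
    that avoids division by zero when both masks are empty.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
    iou = (intersection + eps) / (union + eps)
    return float(dice), float(iou)


def mean_over_lesions(per_lesion_scores):
    """Unweighted mean of one score per lesion type (e.g. EX, HE, MA, SE)."""
    return sum(per_lesion_scores) / len(per_lesion_scores)


if __name__ == "__main__":
    # Toy 2x2 masks: two predicted pixels, one of which is correct.
    predicted = [[1, 1], [0, 0]]
    ground_truth = [[1, 0], [0, 0]]
    print(dice_and_iou(predicted, ground_truth))

    # Per-lesion AUC values reported in the abstract for IDRiD (EX, HE, MA, SE).
    idrid_auc = [0.9171, 0.7197, 0.6557, 0.7087]
    print(round(mean_over_lesions(idrid_auc), 4))  # 0.7503, the reported mAUC
```

The unweighted mean treats each lesion type equally regardless of how many pixels it covers, which is consistent with the reported IDRiD numbers.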

English keywords:

diabetic retinopathy (DR); multi-lesion segmentation; triple attention; cascaded context fusion; balanced attention

Authors:

Guo Yanfei (郭燕飞), Du Hangli (杜杭丽), Yang Chenglong (杨成龙), Kong Xiangzhen (孔祥真)

Author affiliation:

School of Computer Science, Qufu Normal University, Rizhao 276827

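Keywords (translated from Chinese):

diabetic retinopathy (DR); multi-lesion segmentation; triple attention; cascaded context fusion; balanced attention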

Publication year:

2024

DOI:

10.11834/jig.230827

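Journal of Image and Graphics (中国图象图形学报)

Sponsors: Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Institute of Applied Physics and Computational Mathematics, Beijing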

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 1.111

ISSN: 1006-8961

Year, volume (issue): 2024, 29(12)