Medical Image Segmentation Based on Multi-Scale Convolution Modulation
Currently, more and more medical image segmentation models use the Transformer as their basic structure. However, the computational complexity of the Transformer is quadratic in the input sequence length, and large amounts of pre-training data are required to achieve good results; when data are insufficient, the Transformer's advantages cannot be fully realized. In addition, the Transformer often fails to effectively extract local information from images. Convolutional neural networks, in contrast, avoid both of these problems. To fully leverage the strengths of both convolutional neural networks and Transformers, and to further explore the potential of convolutional networks, this paper proposes a multi-scale convolution modulation network (MSCMNet). The model incorporates the design methodology of vision Transformers into a traditional convolutional network: using convolution modulation together with a multi-scale feature extraction strategy, it constructs a feature extraction module based on multi-scale convolution modulation (MSCM). Efficient patch combination and patch decomposition strategies are also proposed for downsampling and upsampling the feature maps, respectively, further enhancing the model's representation ability. The mDice scores obtained on four medical image segmentation datasets of different types and sizes (abdominal multi-organ, cardiac, skin cancer, and cell nucleus) are 0.8057, 0.9233, 0.9239, and 0.8548, respectively. With lower computational complexity and fewer parameters, MSCMNet achieves the best segmentation performance, providing a novel and efficient model structure design paradigm for convolutional neural networks and Transformers in the field of medical image segmentation.
Keywords: medical image segmentation; multi-scale; convolution modulation; Transformer
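To make the core idea concrete, the sketch below shows one plausible reading of a multi-scale convolution modulation block: parallel depthwise convolutions at several kernel sizes aggregate multi-scale context, and that context modulates a pointwise "value" branch by element-wise multiplication, in the spirit of convolution-modulation designs. All module names, kernel sizes, and the exact wiring are illustrative assumptions, not the paper's actual MSCM definition.

```python
import torch
import torch.nn as nn


class MSCMBlockSketch(nn.Module):
    """Hypothetical multi-scale convolution modulation block (illustrative only).

    Context from parallel depthwise convolutions gates a 1x1 "value"
    projection element-wise; a residual connection preserves the input.
    """

    def __init__(self, dim: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.norm = nn.GroupNorm(1, dim)          # channel-wise normalization
        self.proj_in = nn.Conv2d(dim, dim, 1)     # pointwise pre-projection
        self.value = nn.Conv2d(dim, dim, 1)       # value branch
        # Parallel depthwise convolutions at several kernel sizes
        # capture context at multiple spatial scales.
        self.dwconvs = nn.ModuleList(
            nn.Conv2d(dim, dim, k, padding=k // 2, groups=dim)
            for k in kernel_sizes
        )
        self.proj_out = nn.Conv2d(dim, dim, 1)    # pointwise post-projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        shortcut = x
        x = self.norm(x)
        # Multi-scale context: sum of depthwise conv outputs.
        context = sum(dw(self.proj_in(x)) for dw in self.dwconvs)
        # Convolution modulation: context gates the value branch.
        x = self.proj_out(context * self.value(x))
        return x + shortcut


x = torch.randn(1, 32, 16, 16)
y = MSCMBlockSketch(32)(x)
print(y.shape)  # torch.Size([1, 32, 16, 16])
```

Because every convolution is either pointwise or depthwise, the block's cost grows roughly linearly with spatial resolution, consistent with the abstract's claim of lower complexity than quadratic self-attention.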