3D multi-organ segmentation network combining local and global features and multi-scale interaction
Objective Highly conformal radiotherapy is a widely adopted cancer treatment modality that requires meticulous characterization of cancer tissues and comprehensive delineation of the surrounding anatomical structures. The efficacy and safety of this technique depend largely on the ability to precisely target the tumor, which necessitates a thorough understanding of the corresponding organ-at-risk anatomy. Thus, accurate and detailed depiction of the neoplastic and adjacent normal tissues using advanced imaging techniques is critical to optimizing the outcomes of highly conformal radiotherapy. Because conventional segmentation methods cannot delineate multi-organ structures from 3D medical images accurately and efficiently, developing precise, automated segmentation techniques based on deep learning is a promising research direction. By leveraging the capacity of deep neural networks (DNNs) to learn complex hierarchical representations from vast amounts of labeled data, such techniques can identify and extract specific features and patterns from medical images, leading to more reliable and efficient segmentation outcomes. They could significantly enhance the clinical utility of imaging data in various diagnostic and therapeutic applications, including but not limited to radiation therapy planning, surgical navigation, and disease assessment. Over the past few years, there has been increasing interest in integrating the vision Transformer (ViT) with convolutional neural networks (CNNs) to enhance the quality and accuracy of semantic segmentation. One promising research direction addresses multi-scale representation, which is critical for achieving robust and precise segmentation results on various medical imaging datasets. However, current state-of-the-art methods have not fully exploited the potential of multi-scale
interaction between CNNs and ViTs. For example, some methods disregard multi-scale structures entirely or achieve them only by limiting the computational scope of ViTs. Other methods rely solely on a CNN or a ViT at a given scale, disregarding their complementary advantages. In addition, existing multi-scale interaction methods often neglect the spatial association between two-dimensional slices, resulting in poor performance on volumetric data. Therefore, further research is needed to solve these problems.

Method This research addresses the limitations of existing methods for multi-organ segmentation in 3D medical images by proposing a new approach. Recognizing the importance of capturing local and global features simultaneously at the same scale, a universal feature encoder, the LoGoF module, is introduced for use in multi-organ segmentation networks. Built on the LoGoF module, an end-to-end 3D medical image multi-organ segmentation network (denoted as M0) is constructed. To further enhance the model's ability to capture complex relationships between organs at different scales, a multi-scale interaction module and an attention-guided structure are incorporated into M0. These techniques introduce spatial priors into the features extracted at different scales, enabling M0 to accurately perceive inter-organ relationships and identify organ boundaries. With these components, the proposed model, called LoGoFUNet, enables robust and efficient multi-organ segmentation in 3D medical images. Overall, this approach is a step forward in the accuracy and efficiency of multi-organ segmentation in clinical applications.

Result In experiments conducted on two well-known medical imaging datasets (i.e., Synapse and SegTHOR), LoGoFUNet demonstrated clear gains in accuracy over the second-best model. Compared with the runner-up, LoGoFUNet achieved a 2.94% improvement
in the Dice similarity coefficient on the Synapse dataset and a 4.93% improvement on the SegTHOR dataset. Furthermore, the 95th percentile Hausdorff distance decreased by 8.55 and 2.45 on Synapse and SegTHOR, respectively, indicating an overall improvement in multi-organ segmentation performance. On the ACDC dataset, where 3D segmentation methods generally perform poorly, LoGoFUNet still obtains better results than advanced 2D methods. This result indicates that LoGoFUNet adapts well to different types of datasets. These findings suggest that LoGoFUNet is a highly competitive and robust framework for accurate multi-organ segmentation in various clinical settings. Ablation experiments provide additional evidence for the effectiveness of LoGoFUNet. These experiments verify the role and contribution of each proposed component, including the LoGoF encoder, the multi-scale interaction module, and the attention-guided structure, in achieving the segmentation performance observed with LoGoFUNet. By systematically removing each component and evaluating the impact on segmentation accuracy, these experiments confirm that the proposed module design is rational and effective. Thus, the results of the ablation experiments further reinforce the value and potential clinical significance of adopting the LoGoFUNet framework for multi-organ segmentation in 3D medical imaging applications.

Conclusion The experimental evaluation of the proposed segmentation model suggests that it effectively integrates information exchange within and between different scales. This leads to improved segmentation performance and better generalization on the evaluated datasets. By facilitating the interaction of multi-scale representations through intra- and inter-scale information exchange mechanisms, this approach enables the model to
accurately determine complex spatial relationships and produce high-quality segmentations across a range of 3D medical imaging datasets. These findings highlight the importance of multi-scale features and information exchange in achieving robust and accurate medical image segmentation results. Lastly, the results suggest that the proposed framework could provide significant benefits in a variety of clinical applications.
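
The Result section reports two metrics, the Dice similarity coefficient and the 95th percentile Hausdorff distance (HD95). The abstract does not say how they were computed, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' evaluation code. The function names, the `spacing` parameter, and the handling of empty masks are illustrative choices. HD95 is computed here by brute force over all foreground voxels and as the maximum of the two directed 95th percentiles; surface-based and pooled-percentile variants also exist and give somewhat different values.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice similarity coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as a perfect match
    return 2.0 * np.logical_and(pred, target).sum() / denom


def mean_dice(pred_labels, target_labels, num_classes):
    """Average Dice over organ labels 1..num_classes-1 (0 is background)."""
    scores = [
        dice_coefficient(pred_labels == c, target_labels == c)
        for c in range(1, num_classes)
    ]
    return float(np.mean(scores))


def hd95(pred, target, spacing=(1.0, 1.0, 1.0)):
    """Symmetric 95th percentile Hausdorff distance for 3D binary masks.

    Simplification: distances are taken over all foreground voxels rather
    than surface voxels only, and every pair is compared, so this is only
    practical for small volumes.
    """
    scale = np.asarray(spacing, dtype=float)
    p = np.argwhere(np.asarray(pred, dtype=bool)) * scale
    t = np.argwhere(np.asarray(target, dtype=bool)) * scale
    if len(p) == 0 or len(t) == 0:
        return float("nan")  # undefined when either mask is empty
    # Pairwise Euclidean distances, shape (len(p), len(t)).
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    pred_to_target = d.min(axis=1)  # nearest target voxel for each pred voxel
    target_to_pred = d.min(axis=0)  # nearest pred voxel for each target voxel
    return float(max(np.percentile(pred_to_target, 95),
                     np.percentile(target_to_pred, 95)))
```

Higher Dice and lower HD95 indicate better agreement with the reference segmentation, which matches the direction of the improvements reported above. The distance is expressed in whatever unit `spacing` uses; with the default it is in voxels.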