Objective Pneumonia is one of the most common and fatal diseases of childhood. Accurate segmentation of lung CT images is crucial for early detection. However, manually outlining infected lung regions is labor-intensive for radiologists, and automatic segmentation holds significant promise for alleviating the strain on medical resources. In childhood pneumonia CT images, infected areas are often fragmented across different lung lobes, so precise global contextual information is essential for accurate segmentation. While purely transformer-based segmentation networks have demonstrated strong learning capabilities in this regard, they often struggle to produce high-quality local details because of limited patch size and insufficient local prior knowledge. Moreover, lung structures such as the hilum and mediastinal areas closely resemble infected regions in childhood pneumonia, which demands a robust network to minimize interference. To address these challenges, we propose a prior graph convolution and transformer fusion network based on U-Net (GTU-Net).

Methods The core concept of GTU-Net is to integrate a graph convolutional network (GCN) and a Transformer so that each complements the other's strengths. It uses the GCN to establish pixel relationships within each patch and then leverages the Transformer to capture global information between patches. In addition, a novel method called prior graph learning (PGL) is introduced within the GCN to mitigate interference from irrelevant regions. GTU-Net comprises three main modules: PGL, graph convolution mixed transformer (GCT), and the encoder-decoder structure of U-Net, as illustrated in Fig. 2. The features extracted by the encoder are first processed using coordinate-aware projection (Fig. 3) to form graph nodes and adjacency matrices. Subsequently, the adjacency matrices are further refined by the PGL module (Fig. 4), which uses a supervised approach to incorporate category priors and localization information
from labels. The data are then divided into non-overlapping subgraphs. With the assistance of PGL, the local reasoning capability of the GCN is significantly enhanced, enabling precise descriptions of intra-class and inter-class feature relationships. This design is referred to as a prior graph convolutional network (PriorGCN). Next, the divided graph data are fed into the GCT module, which consists of PriorGCN and a Vision Transformer (ViT). GCT sequentially establishes intra-patch local relationships and inter-patch global relationships, thereby addressing the challenges posed by complex local structures and scattered infection regions in childhood pneumonia. Finally, the decoder performs upsampling to produce the final segmentation result.

Results and Discussions One private childhood pneumonia dataset (Child-P) and two publicly available COVID-19 CT datasets (COVID and MosMed) are used to validate the proposed GTU-Net. The ablation results indicate that each proposed module noticeably improves segmentation performance (Table 2). Specifically, PriorGCN contributes the most, with improvements of 4.44 percentage points in DSC, 6.82 percentage points in JI, 6.31 percentage points in SE, and 4.41 percentage points in MCC, and a reduction of 0.1615 pixels in ASD compared with the baseline. In comparative experiments, GTU-Net achieves the best performance across all metrics on the Child-P dataset (Table 4), excelling particularly in JI and MCC, with improvements of 2.91 and 1.85 percentage points, respectively, over the second-best network. Moreover, GTU-Net demonstrates superior sensitivity in segmenting fragmented and tiny lesions, producing more complete segmentations in these regions than other networks (Fig. 10). Similarly, GTU-Net shows the best performance on the COVID dataset (Table 5), with a particularly notable improvement in the SE metric, highlighting the feature discrimination capability of the PGL module. GTU-Net also outperforms other networks in the DSC, JI, and MCC
metrics on the MosMed dataset, achieving improvements of 1.70, 1.77, and 1.93 percentage points, respectively, over the second-best network (Table 4). Visualization results from the two COVID-19 datasets show that GTU-Net effectively reduces both under-segmentation and over-segmentation (Fig. 11). Additionally, GTU-Net produces superior local segmentation results, avoiding the checkerboard artifact often seen in transformer-based networks (Fig. 12). Importantly, GTU-Net maintains its superior performance even when trained on small datasets without pre-training on larger datasets (Fig. 13).

Conclusions We select childhood pneumonia as our research focus, an area relatively underexplored in existing studies. We propose a novel GTU-Net to address the segmentation challenges presented by childhood pneumonia CT images, which are characterized by high noise interference, tiny lesions, and fragmented lesion distribution. GTU-Net incorporates a GCT module to systematically capture local and global information. Additionally, a PGL module is introduced to construct a high-quality graph adjacency matrix for the GCN, enhancing the network's ability to discriminate between inter-class and intra-class features. Unlike most existing transformer-based segmentation networks, GTU-Net does not rely on pre-training, which strengthens its clinical applicability. Experimental results on a private childhood pneumonia CT dataset demonstrate that GTU-Net outperforms state-of-the-art transformer networks. Furthermore, it performs strongly on two publicly available COVID-19 CT datasets, verifying its generalizability.
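
The overlap metrics reported in the results can be illustrated with a short sketch. It assumes that DSC, JI, SE, and MCC denote the Dice similarity coefficient, Jaccard index, sensitivity, and Matthews correlation coefficient in their standard confusion-matrix forms; the abstract does not spell these out, so this is an assumption rather than the authors' evaluation code. The function names, the flat 0/1 mask representation, and the convention of returning 0.0 for an empty denominator are all hypothetical choices made for illustration. ASD is omitted because it requires surface extraction.

```python
import math


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-length flat sequences of 0/1 labels."""
    if len(pred) != len(truth):
        raise ValueError("masks must contain the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Return DSC, JI, SE, and MCC for a pair of binary masks (assumed definitions)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)

    def ratio(num, den):
        # Assumed convention: report 0.0 when the denominator is empty.
        return num / den if den else 0.0

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "DSC": ratio(2 * tp, 2 * tp + fp + fn),  # Dice similarity coefficient
        "JI": ratio(tp, tp + fp + fn),           # Jaccard index (IoU)
        "SE": ratio(tp, tp + fn),                # sensitivity (recall)
        "MCC": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy example: 8 pixels, flattened predicted and ground-truth masks.
predicted = [1, 1, 0, 0, 1, 0, 0, 0]
reference = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = segmentation_metrics(predicted, reference)
print(metrics)
```

Under these definitions DSC and JI are linked by DSC = 2·JI / (1 + JI), so the two always rank methods identically, but a given change in overlap shows up as a different number of percentage points in each metric.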