CNN and Scale Adaptive Transformer Fusion Network for Pavement Crack Segmentation
The development of pavement crack segmentation technology is crucial for assessing the safety and durability of civil infrastructure. However, accurately segmenting irregularly shaped cracks against complex and dynamic backgrounds remains a challenging task. To improve segmentation performance, we propose a pavement crack segmentation method based on a CNN and scale-adaptive Transformer fusion network. Specifically, in the dual encoder based on a CNN and a Transformer, we use scale-adaptive Transformer blocks, which integrate scale-adaptive multi-head attention and a detail-enhanced feed-forward network to capture multi-scale features and enhance detail information. Additionally, we employ a global-local feature fusion module to aggregate the features from the intermediate layers of the dual encoder. For the decoder, we design a large-kernel dual-attention module to sharpen boundary details and mitigate the influence of background noise, achieving highly accurate crack segmentation. Finally, we combine the cross-entropy and Dice losses to optimize network training. We conduct comprehensive comparison and ablation experiments on the DeepCrack, Crack500, and CFTR478 datasets to demonstrate the effectiveness of the proposed method. The experimental results show that our method is superior to the compared methods, outperforming DTrc-Net and FAT-Net on the CFTR478 validation set by 1.58% and 1.82% mIoU, respectively. Furthermore, in complex scenes with low-light, rainy, and slippery conditions, and on roads made of different materials, our method still identifies and accurately segments the crack regions while maintaining clear boundaries. Moreover, our method is applicable to pavement crack segmentation in real campus scenarios, where it produces high-quality segmentations of pavement cracks, and it has good prospects for practical application.
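The combination of cross-entropy and Dice losses mentioned above can be illustrated with a minimal sketch. The abstract does not give the loss weights, the smoothing constant, or whether the formulation is binary or multi-class, so everything below is an assumption: a binary crack/background setting, per-pixel probabilities flattened into a list, and hypothetical parameters `w_ce`, `w_dice`, and `smooth`. It is not the authors' implementation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened per-pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum(P) + sum(T) + smooth)."""
    overlap = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * overlap + smooth) / (denom + smooth)


def combined_loss(probs, targets, w_ce=1.0, w_dice=1.0):
    """Weighted sum of the two losses; equal weights are an assumption."""
    return w_ce * bce_loss(probs, targets) + w_dice * dice_loss(probs, targets)


if __name__ == "__main__":
    target = [1, 0, 1, 0]           # 1 = crack pixel, 0 = background
    good = [0.9, 0.1, 0.8, 0.2]     # prediction close to the target
    bad = [0.2, 0.8, 0.3, 0.7]      # prediction far from the target
    print(combined_loss(good, target), combined_loss(bad, target))
```

The cross-entropy term penalizes each pixel independently, while the Dice term measures region overlap and is less sensitive to the strong class imbalance typical of crack images, where crack pixels are a small fraction of the image. That complementarity is a common motivation for summing the two.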