Abstract
© 2025 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS)

Semantic segmentation of land transportation scenes is critical for infrastructure maintenance and the advancement of intelligent transportation systems. Unlike traditional large-scale scenes, land transportation environments exhibit intricate structural dependencies among infrastructure elements and pronounced class imbalance. To address these challenges, we propose a Gaussian-enhanced positional encoding block that leverages the intrinsic smoothing and reweighting properties of the Gaussian function to project relative positional information into a higher-dimensional space. Fusing this enhanced representation with the original positional encoding gives the model a more nuanced understanding of the spatial dependencies among infrastructure elements, thereby improving semantic segmentation in complex land transportation scenes. Furthermore, we introduce the Multi-Context Interaction Module (MCIM) into the backbone network and vary the number of MCIMs across network levels to strengthen inter-layer context interactions and mitigate error accumulation. To counter class imbalance and excessive object adhesion within the scene, we incorporate a boundary-aware class-balanced (BCB) hybrid loss function. Comprehensive experiments on three distinct land transportation datasets validate the effectiveness of our approach, and comparisons with state-of-the-art methods show consistently superior performance. Specifically, our method attains the highest mIoU (91.8%) and OA (96.7%) on the high-speed rail dataset ExpressRail, the highest mIoU (73.3%) on the traditional railway dataset SNCF, and the highest mF1-score (87.4%) on the urban road dataset Pairs3D. The code is available at https://github.com/Kange7/CoBa.
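For readers unfamiliar with the reported metrics, the following minimal sketch shows how overall accuracy (OA), mean intersection over union (mIoU), and mean F1-score (mF1) are commonly derived from a per-class confusion matrix. It is an illustration under stated assumptions, not the evaluation code from the repository: the function names `confusion_matrix` and `segmentation_metrics` are hypothetical, the toy labels are made up, and skipping classes that appear in neither the labels nor the predictions is an assumed convention that may differ from the one used in the experiments.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the ground-truth class, columns the predicted class."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Return (OA, mIoU, mF1) computed from a square confusion matrix."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                          # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp     # predicted c, true class otherwise
        if tp + fp + fn == 0:
            continue  # assumption: ignore classes absent from labels and predictions
        ious.append(tp / (tp + fp + fn))
        f1s.append(2 * tp / (2 * tp + fp + fn))
    oa = correct / total
    return oa, sum(ious) / len(ious), sum(f1s) / len(f1s)


# Toy example (hypothetical data): six points, three classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
oa, miou, mf1 = segmentation_metrics(confusion_matrix(labels, preds, 3))
print(f"OA={oa:.3f}  mIoU={miou:.3f}  mF1={mf1:.3f}")
```

Because mIoU and mF1 average over classes rather than over points, they penalize poor performance on rare classes more strongly than OA does, which is why they are informative for class-imbalanced scenes such as those described above.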