Channel-interaction multi-level fusion network for change detection by integrating CNN and Transformer
This paper proposes a novel Channel-Interaction Multi-Level Fusion Network (CIMLFNet) for change detection (CD) based on a parallel architecture, which effectively integrates the local characteristics of the convolutional neural network (CNN) and the global characteristics of the Transformer. First, based on the complementary CNN and Transformer, a triple-channel feature extractor is designed to fully extract the spatial-temporal features of bi-temporal images. Second, a Pyramid Spatial-Temporal Cross-Attention Module (PSTCAM) is constructed to highlight the change information. PSTCAM leverages the features extracted by Channel 2 to enhance the features extracted by Channels 1 and 3. Then, to make full use of the advantages and complementarity of CNN and Transformer, a dual-branch channel-interaction multi-level fusion module is proposed. It fuses the enhanced features from the perspectives of level-priority and channel-priority, respectively. Finally, a simple yet effective boundary-region-enhancement classifier is proposed. On the four public CD datasets, namely WHU, Google, GVLM, and LEVIR, the F1/IoU values of the proposed CIMLFNet are 91.19%/83.80%, 85.97%/75.40%, 88.85%/79.94%, and 90.07%/81.94%, respectively, which are significantly better than those of the six comparison methods. The experimental results confirm the effectiveness of CIMLFNet.
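
For readers unfamiliar with the reported metrics, the following minimal sketch shows how F1 and IoU are commonly computed for the change class from pixel counts. It assumes the standard definitions, since the abstract does not state the exact evaluation protocol; the function names `f1_iou` and `iou_from_f1` are hypothetical and used only for illustration.

```python
def f1_iou(tp, fp, fn):
    """Return (F1, IoU) for the change class from pixel counts.

    Assumes the standard definitions:
      F1  = 2*TP / (2*TP + FP + FN)
      IoU = TP / (TP + FP + FN)
    """
    denom_f1 = 2 * tp + fp + fn
    denom_iou = tp + fp + fn
    f1 = 2 * tp / denom_f1 if denom_f1 else 0.0
    iou = tp / denom_iou if denom_iou else 0.0
    return f1, iou


def iou_from_f1(f1):
    """Under the definitions above, IoU = F1 / (2 - F1)."""
    return f1 / (2 - f1)


# Illustrative counts only (not taken from the paper).
f1, iou = f1_iou(tp=80, fp=10, fn=10)
print(round(f1, 4), round(iou, 4))

# Consistency check against the F1 value reported for WHU.
print(round(iou_from_f1(0.9119), 4))
```

Under these definitions the two metrics are tied by IoU = F1 / (2 - F1), and the reported F1/IoU pairs agree with this relation to within rounding.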