Dual-Stream Spatiotemporal Fusion Super-Resolution Reconstruction of Remote Sensing Images Based on the Swin Transformer
Spatiotemporal fusion super-resolution reconstruction of remote sensing images combines information from low-spatial-resolution images with high temporal density and high-spatial-resolution images with low temporal density to generate images with both high temporal and high spatial resolution. The quality of this reconstruction directly affects downstream tasks such as interpretation, detection, and tracking. With the rapid advancement of Convolutional Neural Networks (CNNs), researchers have proposed a series of CNN-based spatiotemporal fusion methods; however, owing to the inherent locality of convolution operations, these methods still struggle to capture global information. Inspired by the global modeling capability of the Swin Transformer, this paper proposes a Swin Transformer-based super-resolution reconstruction model. In the feature extraction stage, a dual-stream structure divides the feature extraction network into two branches that extract temporal and spatial information separately, and the Swin Transformer's global modeling capability strengthens the extracted features. In the feature fusion stage, a Convolutional Block Attention Module (CBAM) combining channel and spatial attention highlights the important features and improves reconstruction accuracy. Comparative experiments against representative spatiotemporal fusion super-resolution reconstruction models are conducted on the Coleambally Irrigation Area (CIA) and Lower Gwydir Catchment (LGC) datasets. The results show that the proposed model achieves the best performance on all evaluation metrics, demonstrating superior accuracy and stronger generalization.
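To make the described architecture concrete, the following is a minimal PyTorch sketch of the dual-stream feature extraction and CBAM-based fusion outlined above. It is illustrative only: the convolutional branches stand in for the Swin Transformer stages (window-based self-attention), and the band counts, channel widths, and upscaling factor are assumptions rather than the configuration used in the paper.

# Minimal sketch (assumptions: PyTorch; plain conv branches stand in for Swin stages;
# input bands, feature width, and scale factor are illustrative, not the paper's values).
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):
        super().__init__()
        # Channel attention: shared MLP over average- and max-pooled channel descriptors.
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )
        # Spatial attention: conv over concatenated channel-wise average and max maps.
        self.spatial = nn.Conv2d(2, 1, spatial_kernel, padding=spatial_kernel // 2)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)            # channel attention
        avg_map = x.mean(dim=1, keepdim=True)
        max_map = x.amax(dim=1, keepdim=True)
        attn = torch.sigmoid(self.spatial(torch.cat([avg_map, max_map], dim=1)))
        return x * attn                                              # spatial attention

class DualStreamFusionSR(nn.Module):
    """Dual-stream feature extraction (temporal and spatial branches), CBAM-weighted
    fusion, and a sub-pixel reconstruction head. The branches below are stand-ins
    for the Swin Transformer stages used in the paper."""
    def __init__(self, in_ch=6, feat_ch=64, out_ch=6, scale=4):
        super().__init__()
        def branch():
            return nn.Sequential(
                nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            )
        self.temporal_branch = branch()   # coarse, temporally dense input stream
        self.spatial_branch = branch()    # fine, temporally sparse input stream
        self.cbam = CBAM(2 * feat_ch)
        self.reconstruct = nn.Sequential(
            nn.Conv2d(2 * feat_ch, feat_ch * scale ** 2, 3, padding=1),
            nn.PixelShuffle(scale),       # sub-pixel upsampling to the target grid
            nn.Conv2d(feat_ch, out_ch, 3, padding=1),
        )

    def forward(self, temporal_in, spatial_in):
        f_t = self.temporal_branch(temporal_in)
        f_s = self.spatial_branch(spatial_in)
        fused = self.cbam(torch.cat([f_t, f_s], dim=1))  # emphasize informative channels/locations
        return self.reconstruct(fused)

# Example: fuse two co-registered 6-band 64x64 inputs into a 256x256 prediction.
model = DualStreamFusionSR()
pred = model(torch.randn(1, 6, 64, 64), torch.randn(1, 6, 64, 64))
print(pred.shape)  # torch.Size([1, 6, 256, 256])

In this sketch the two streams are fused by simple channel concatenation before CBAM; the attention module then reweights the concatenated features along both the channel and spatial dimensions before reconstruction.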