Remote Sensing Small Target Detection Based on Multimodal Fusion
This paper proposes a remote sensing small target detection algorithm based on multimodal fusion to address the problems of high similarity between detection targets and the background, inaccurate target localization, and difficult feature extraction. In the feature extraction component, multimodal fusion is used to extract shared and modality-specific information across the different modalities, so that the modalities complement one another and the model's information extraction capability is enhanced. In the feature fusion component, a receptive-field spatial attention convolution is introduced to accurately perceive spatial positions within the feature map and to weight the importance of each feature in the receptive field. In the prediction component, the Shape-IoU (Shape-intersection over union) bounding box regression loss function is adopted; it considers not only the geometric relationship between the ground-truth and predicted boxes but also the inherent characteristics of the bounding box itself, improving regression accuracy. Experimental evaluations on the VEDAI and NWPU datasets demonstrate that the improved algorithm achieves mean average precisions of 72.83% and 93.5%, respectively, surpassing the baseline model by 8.40 and 2.7 percentage points. Compared with other state-of-the-art algorithms, the proposed algorithm effectively reduces both the false detection and missed detection rates.
Keywords: multimodal fusion; receptive field; spatial position information; bounding box regression; geometric relationship
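To make the loss design described above concrete, the following is a minimal sketch of a Shape-IoU-style bounding box regression loss for a single box pair, combining an IoU term, a shape-weighted center-distance term, and a width/height shape penalty. It follows the general Shape-IoU formulation rather than the authors' exact implementation; the `scale` and `theta` defaults and the single-pair NumPy interface are assumptions for illustration only.

```python
import numpy as np

def shape_iou_loss(pred, gt, scale=0.0, theta=4.0):
    """Illustrative Shape-IoU-style loss for one (pred, gt) box pair.

    Boxes are (x1, y1, x2, y2). `scale` and `theta` are placeholder
    hyperparameters, not the paper's tuned values.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Widths, heights, and centers of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1
    pcx, pcy = (px1 + px2) / 2, (py1 + py2) / 2
    gcx, gcy = (gx1 + gx2) / 2, (gy1 + gy2) / 2

    # Standard IoU between the two boxes.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + 1e-9)

    # Shape weights derived from the ground-truth box's own width/height.
    ww = 2 * gw**scale / (gw**scale + gh**scale)
    hh = 2 * gh**scale / (gw**scale + gh**scale)

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw**2 + ch**2 + 1e-9

    # Center-distance term, weighted by the shape factors.
    dist_shape = hh * (pcx - gcx) ** 2 / c2 + ww * (pcy - gcy) ** 2 / c2

    # Shape term penalizing width/height mismatch with the ground truth.
    omega_w = hh * abs(pw - gw) / max(pw, gw)
    omega_h = ww * abs(ph - gh) / max(ph, gh)
    omega_shape = (1 - np.exp(-omega_w)) ** theta + (1 - np.exp(-omega_h)) ** theta

    return 1 - iou + dist_shape + 0.5 * omega_shape


# Example: a slightly offset prediction against its ground-truth box.
print(shape_iou_loss(pred=(10, 10, 50, 40), gt=(12, 12, 52, 44)))
```

The shape weights couple the penalty on center offset and size error to the ground-truth box's own proportions, which is how this family of losses accounts for the box's inherent characteristics in addition to the geometric overlap measured by IoU.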