ADC-CPANet: A remote sensing image classification method based on local-global feature fusion
The rapid development of remote sensing technologies, such as satellites and unmanned aerial vehicles, has led to a surge in the amount and variety of high-resolution remote sensing images, marking the onset of the "era of remote sensing big data." Compared with low-resolution images, high-resolution remote sensing images provide richer texture, more detailed information, and more complex structure, making them crucial for applications such as urban planning. However, images within the same category can vary substantially, whereas images from different categories may appear similar. Multi-scale feature extraction is therefore important for remote sensing image scene classification. Current methods for remote sensing image scene classification can be divided into two categories according to their feature representation: those based on handcrafted features and those based on deep learning. Handcrafted-feature methods, such as the scale-invariant feature transform and gradient histograms, can achieve good results on simple classification tasks, but the features they extract may be incomplete or redundant, so their accuracy in complex scenes remains low. By contrast, deep learning methods have made remarkable progress in scene classification owing to their powerful feature extraction ability. Convolutional Neural Networks (CNNs) are widely used in visual tasks, particularly through increasingly complex connections and diverse convolution forms. CNNs are effective at extracting local features, but they struggle to capture long-distance dependencies among features. The Transformer architecture, which has recently been applied to computer vision, addresses this limitation through its self-attention layers, which enable global feature extraction. Recent studies show that hybrid architectures combining CNNs and Transformers can exploit the advantages of both. This study proposes an Aggregation Depth-wise Convolution (ADC) module and a Convolution Parallel Attention (CPA) module. The ADC module effectively extracts local feature information and improves the model's robustness to image flipping and rotation. The CPA module integrates global and local feature extraction, with a multi-group convolution head decomposition designed to expand the receptive field and enhance feature extraction capacity. On the basis of these two modules, a remote sensing image scene classification model called ADC-CPANet is designed; ADC and CPA modules are stacked at each stage of the model, improving its ability to extract both global and local features. The effectiveness of ADC-CPANet is validated on the RSSCN7 and Google Image datasets. Experimental results demonstrate that ADC-CPANet achieves classification accuracies of 96.43% on the RSSCN7 dataset and 96.04% on the Google Image dataset, outperforming other advanced models. ADC-CPANet thus excels at extracting global and local features and achieves competitive scene classification accuracy.
Keywords: remote sensing image; scene classification; convolutional neural network; Transformer; Multi-Gconv Head Decomposition Attention; ADC-CPANet model
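The abstract describes the two building blocks at a high level but gives no layer-level detail, so the following is a minimal PyTorch-style sketch of how such modules are commonly structured. Everything here is an assumption: the kernel sizes, the residual composition, and the class names ADCBlock and CPABlock are illustrative, and standard multi-head self-attention stands in for the paper's Multi-Gconv Head Decomposition Attention, which is not specified in this abstract.

```python
import torch
import torch.nn as nn

class ADCBlock(nn.Module):
    """Hypothetical ADC-style block: aggregates depth-wise convolutions at
    several kernel sizes to capture local structure at multiple scales."""
    def __init__(self, dim: int):
        super().__init__()
        # groups=dim makes each convolution depth-wise (one filter per channel).
        self.dw3 = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.dw5 = nn.Conv2d(dim, dim, 5, padding=2, groups=dim)
        self.point = nn.Conv2d(dim, dim, 1)  # 1x1 conv mixes channels
        self.norm = nn.BatchNorm2d(dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sum multi-scale local responses, mix channels, add a residual.
        y = self.dw3(x) + self.dw5(x)
        return x + self.act(self.norm(self.point(y)))

class CPABlock(nn.Module):
    """Hypothetical CPA-style block: a convolutional branch (local features)
    runs in parallel with self-attention (global features); outputs are fused
    by summation."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.local = nn.Conv2d(dim, dim, 3, padding=1, groups=dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = self.local(x)
        # Flatten spatial positions into a token sequence for attention.
        tokens = self.norm(x.flatten(2).transpose(1, 2))   # (B, H*W, C)
        glob, _ = self.attn(tokens, tokens, tokens)
        glob = glob.transpose(1, 2).reshape(b, c, h, w)
        return x + local + glob  # fuse local and global branches

# One assumed stage: local aggregation followed by parallel local-global fusion.
x = torch.randn(2, 64, 56, 56)
y = CPABlock(64)(ADCBlock(64)(x))  # shape preserved: (2, 64, 56, 56)
```

In this sketch, stacking ADCBlock before CPABlock at each stage mirrors the abstract's description of alternating local and local-global modules; the actual ADC-CPANet stage composition may differ.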