Complex backgrounds and widely varying object scales limit classification accuracy in remote sensing image scene classification. To address this, this paper introduces a remote sensing image scene classification model based on a depthwise separable multiscale dilated feature fusion network with an attention mechanism. First, the model employs a feature extraction module built on depthwise separable convolutions, which extracts deep image features while keeping the parameter count low. Next, a multiscale dilated convolution module expands the network's receptive field, enabling the extraction of both global and contextual features from remote sensing images. Finally, an attention mechanism makes the network focus on important features, and the extracted features are fed into a Softmax classifier for classification. We validate the proposed model on two remote sensing scene classification datasets, AID and WHU-RS19. Experimental results show that, compared with baseline models such as AlexNet, VGG-16, and ResNet18, the proposed model improves accuracy, reaching 93.32% on AID and 91.15% on WHU-RS19, while maintaining a comparatively low parameter count. The proposed model has theoretical significance for remote sensing image scene classification.
Key words
remote sensing image scene classification/convolutional neural networks/depthwise separable convolution/multi-scale/dilated convolution
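
The two structural claims in the abstract, that depthwise separable convolutions reduce the parameter count and that dilated convolutions enlarge the receptive field, can be illustrated with simple arithmetic. The sketch below is not the paper's implementation: the channel counts (64 in, 128 out), the 3x3 kernel, and the dilation rates (1, 2, 3) are assumptions chosen only for illustration, and bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution (one filter per
    input channel) followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes, not taken from the paper.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, depthwise separable: {sep}, ratio: {sep / std:.3f}")

# Hypothetical dilation rates for the parallel multiscale branches.
rates = [1, 2, 3]
extents = [effective_kernel_size(k, d) for d in rates]
print("effective kernel sizes:", extents)
```

Under these assumed sizes, the separable layer needs roughly one eighth of the weights of the standard layer, and the three dilated branches cover 3x3, 5x5, and 7x7 regions with the same number of weights per branch, which is the general idea behind fusing features across several dilation rates.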