LAG-MANet model for remote sensing image scene classification
In remote sensing image scene classification, both local and global information are crucial. Current methods rely mainly on convolutional neural networks (CNNs) and Transformers. While CNNs excel at extracting local information, they have limitations in capturing global information; Transformers, by contrast, capture global information well but incur high computational complexity. To improve scene classification performance for remote sensing images while reducing complexity, a pure convolutional network called LAG-MANet is designed. The network attends to both local and global features and accounts for features at multiple scales. First, after the pre-processed remote sensing image is input, multi-scale features are extracted by a multi-branch dilated convolution block (MBDConv). The features then pass through the four stages of the network in turn; in each stage, local and global features are extracted and fused by the different branches of a parallel dual-domain feature fusion block (P2DF). Finally, global average pooling is applied before the classification labels are output by the fully connected layer. LAG-MANet achieves a classification accuracy of 97.76% on the WHU-RS19 dataset, 97.04% on the SIRI-WHU dataset, and 97.18% on the RSSCN7 dataset. Experimental results on these three challenging public remote sensing datasets demonstrate the superiority of the proposed LAG-MANet.
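The pipeline described above (MBDConv stem, four stages of parallel local/global fusion, then global average pooling and a fully connected layer) can be sketched as follows. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the channel widths, dilation rates, downsampling scheme, and the internals of the `MBDConv` and `P2DF` modules here are all assumptions chosen only to show the overall structure.

```python
import torch
import torch.nn as nn


class MBDConv(nn.Module):
    """Multi-branch dilated convolution block (sketch): parallel 3x3
    convolutions with different dilation rates extract features at
    multiple scales; branch outputs are summed. Dilation rates are
    an assumption."""

    def __init__(self, in_ch, out_ch, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=d, dilation=d)
            for d in dilations
        )

    def forward(self, x):
        return sum(branch(x) for branch in self.branches)


class P2DF(nn.Module):
    """Parallel dual-domain feature fusion block (sketch): a local branch
    (depthwise 3x3 convolution) and a global branch (channel attention
    from global average pooling) run in parallel and are fused by
    addition. The actual branch designs in the paper may differ."""

    def __init__(self, ch):
        super().__init__()
        self.local = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)  # local detail
        self.glob = nn.Sequential(                               # global context
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(ch, ch, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.local(x) + x * self.glob(x)


class LAGMANetSketch(nn.Module):
    """End-to-end sketch: MBDConv stem -> four stages (strided conv
    downsampling + P2DF) -> global average pooling -> linear classifier."""

    def __init__(self, num_classes=19, width=32):
        super().__init__()
        self.stem = MBDConv(3, width)
        stages, ch = [], width
        for _ in range(4):  # four stages, each halving spatial resolution
            stages += [nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1),
                       P2DF(ch * 2)]
            ch *= 2
        self.stages = nn.Sequential(*stages)
        self.head = nn.Linear(ch, num_classes)

    def forward(self, x):
        x = self.stages(self.stem(x))
        x = x.mean(dim=(2, 3))  # global average pooling
        return self.head(x)     # classification logits
```

For example, `LAGMANetSketch(num_classes=19)` applied to a batch of 224x224 RGB images returns one logit per class (19 classes matches WHU-RS19; SIRI-WHU and RSSCN7 would use 12 and 7 respectively).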