A Remote Sensing Image Scene Classification Method Based on CNN-Transformer Semi-Supervised Cross Learning
With the development of deep learning, methods based on Convolutional Neural Networks (CNNs) and Transformers have been studied extensively for fully supervised remote sensing image scene classification. However, achieving good classification performance with limited labeled samples remains challenging. Considering the differences in how CNNs and Transformers extract deep features, a semi-supervised cross-learning method for remote sensing image scene classification (SCL-CTNet) is proposed. By imposing consistency constraints on the outputs of the CNN and the Transformer, the method better extracts information from unlabeled data to guide model training. Specifically, the output of one network on a weakly augmented image serves as a pseudo-label that supervises the prediction of the other network on a strongly augmented version of the same image. This exploits both the local and the global information in unlabeled samples, encourages the two networks to make consistent predictions for the same input image, and improves model generalization. Adaptive thresholding is used to filter the pseudo-labels and improve their reliability. Experimental results on the AID and NWPU-RESISC45 datasets demonstrate the effectiveness of the proposed method.
high-resolution remote sensing images; scene classification; convolutional neural networks; transformer; semi-supervised learning
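
The cross-learning step summarized in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy logits, and in particular the thresholding rule (batch-mean confidence with a lower bound `floor`) are assumptions made for illustration, since the abstract does not state how the adaptive threshold is computed. Each network's prediction on the weakly augmented image becomes a hard pseudo-label for the other network's prediction on the strongly augmented image, and low-confidence pseudo-labels are masked out.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_labels(weak_logits):
    """Return (class index, confidence) for each weakly augmented sample."""
    labels = []
    for row in weak_logits:
        probs = softmax(row)
        conf = max(probs)
        labels.append((probs.index(conf), conf))
    return labels

def adaptive_threshold(confidences, floor=0.5):
    # Assumption: the threshold follows the batch-mean confidence and never
    # drops below `floor`. The paper's actual rule may differ.
    return max(floor, sum(confidences) / len(confidences))

def cross_loss(teacher_weak_logits, student_strong_logits, floor=0.5):
    """Cross-entropy of the student's strong-view predictions against the
    teacher's weak-view pseudo-labels, keeping only confident pseudo-labels.
    Returns (loss averaged over the whole batch, number of samples kept)."""
    labels = pseudo_labels(teacher_weak_logits)
    tau = adaptive_threshold([conf for _, conf in labels], floor)
    total, kept = 0.0, 0
    for (cls, conf), row in zip(labels, student_strong_logits):
        if conf >= tau:
            total += -math.log(softmax(row)[cls])
            kept += 1
    return total / len(labels), kept

def cross_learning_loss(cnn_weak, cnn_strong, vit_weak, vit_strong):
    """Symmetric unsupervised loss: each network supervises the other."""
    loss_cnn, kept_cnn = cross_loss(vit_weak, cnn_strong)  # Transformer -> CNN
    loss_vit, kept_vit = cross_loss(cnn_weak, vit_strong)  # CNN -> Transformer
    return loss_cnn + loss_vit, (kept_cnn, kept_vit)

if __name__ == "__main__":
    # Toy logits for a batch of two unlabeled images and three classes.
    cnn_weak = [[4.0, 0.0, 0.0], [0.2, 0.1, 0.0]]
    cnn_strong = [[0.5, 2.0, 0.0], [0.0, 0.0, 1.0]]
    vit_weak = [[0.0, 5.0, 0.0], [0.0, 0.0, 3.0]]
    vit_strong = [[2.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
    loss, kept = cross_learning_loss(cnn_weak, cnn_strong, vit_weak, vit_strong)
    print(f"unsupervised loss = {loss:.4f}, pseudo-labels kept = {kept}")
```

In a full training loop this term would be added to the supervised loss on the labeled samples, and the pseudo-labels would be treated as constants so that no gradient flows back through the network that produced them. The batch-mean rule used here discards every sample whose confidence falls below the batch average, which is stricter than many practical schemes; it stands in only to show where the filtering happens.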