Transformer-Based Weakly Supervised Semantic Segmentation of Remote Sensing Images
This paper proposes a Transformer-based, end-to-end, image-level weakly supervised semantic segmentation network to address the complex scenes and high annotation cost of remote sensing image semantic segmentation. The network first improves the accuracy and granularity of the class activation map through a multi-class label encoding module. An affinity pseudo-label generation module then further refines the representation of affinity relationships and generates high-precision affinity pseudo labels that serve as segmentation supervision, improving the capability of the weakly supervised network. In parallel, a mixed-label data augmentation module is designed to enrich the composition of the remote sensing data. Finally, a mixed loss function that incorporates an affinity loss is used to improve the learning performance of the network. Experimental results on the iSAID dataset show that the model achieves an mIoU of 38.836% using image-level labels, demonstrating better robustness and reliability than the control network. The method has high application value for weakly supervised semantic segmentation of remote sensing images.
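For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed from predicted and ground-truth label maps. It is not the authors' evaluation code: the function name `mean_iou`, the flattened-list input format, and the choice to skip classes absent from both maps are assumptions made for illustration.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that appear in pred or target.

    pred, target: flat sequences of integer class ids of equal length
    (an assumed input format; a real pipeline would flatten label maps).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where both maps agree on class c
    pred_count = [0] * num_classes     # pixels predicted as class c
    target_count = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both maps are skipped
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no class is present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Toy 2x2 maps, flattened row by row.
    prediction = [0, 1, 1, 1]
    ground_truth = [0, 1, 0, 1]
    print(mean_iou(prediction, ground_truth, num_classes=2))
```

In the toy example, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12. Benchmark evaluations typically accumulate the per-class intersection and union counts over the whole test set before averaging, rather than averaging per-image scores.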