Remote sensing image semantic segmentation network based on global information extraction and reconstruction
To enhance the segmentation of remote sensing scene images for downstream tasks, a network combining multi-scale attention extraction and global information reconstruction was proposed. In the encoder, a multi-scale convolutional attention backbone was introduced into the deep-learning semantic segmentation model; it captures multi-scale information and provides the decoder with richer global deep and shallow features. In the decoder, a global multi-branch local Transformer block was designed: multi-scale channel-wise striped convolutions reconstruct multi-scale spatial context, compensating for the spatial information fragmentation of the global branch, and the global information segmentation map is reconstructed together with global semantic context. At the end of the decoder, a polarized feature refinement head was designed, in which a combination of softmax and sigmoid constructs a probability distribution over the channels; this fits a better output distribution, repairs potential loss of high-resolution information in shallow layers, and guides and integrates deep information to obtain fine spatial texture. Experimental results show that the network achieves high accuracy, with a mean intersection over union (MIoU) of 82.9% on the ISPRS Vaihingen dataset and 87.1% on the ISPRS Potsdam dataset.
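To make the channel-wise probability construction in the refinement head concrete, the following is a minimal numpy sketch of one plausible reading of "a combination of softmax and sigmoid on the channel": a global-average-pooled channel descriptor is passed through a softmax (a distribution over channels) and a sigmoid (a per-channel gate), and their product rescales the feature map. The function name `polarized_channel_gate` and this exact combination are illustrative assumptions, not the paper's verified formulation.

```python
import numpy as np

def polarized_channel_gate(x: np.ndarray) -> np.ndarray:
    """Hypothetical sketch of a softmax+sigmoid channel gate.

    x: feature map of shape (C, H, W).
    Returns the feature map rescaled by per-channel weights.
    """
    c = x.mean(axis=(1, 2))                  # global average pooling -> (C,)
    e = np.exp(c - c.max())
    p = e / e.sum()                          # softmax: probability distribution over channels
    g = 1.0 / (1.0 + np.exp(-c))             # sigmoid: independent per-channel gate
    w = p * g                                # combined channel weights in (0, 1)
    return x * w[:, None, None]              # broadcast rescaling of each channel

feats = np.random.rand(4, 8, 8).astype(np.float64)
refined = polarized_channel_gate(feats)
```

Because the softmax term sums to one across channels, the gate emphasizes a few informative channels rather than scaling all channels uniformly, which matches the abstract's goal of fitting a sharper output distribution.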
Keywords: semantic segmentation; Transformer; multi-scale convolutional attention; global multi-branch local attention; global information reconstruction