Transformer-based multi-scale remote sensing semantic segmentation network

To improve the semantic segmentation of remote sensing images, whose target classes show small inter-class variance and large intra-class variance, this paper proposes a Transformer-based multi-scale remote sensing semantic segmentation network (multi-scale Transformer network, MSTNet) built around two key points: global contextual information and multi-scale semantic features. MSTNet consists of an encoder and a decoder. The encoder comprises a Transformer-based improved visual attention network (VAN) backbone and a multi-scale semantic feature extraction module (MSFEM) improved from the atrous spatial pyramid pooling (ASPP) structure. The decoder is a lightweight multi-layer perceptron (MLP) designed to match the encoder; it fully analyzes the extracted semantic features, which carry both global contextual information and multi-scale representations, drawing on the inductive properties of the Transformer. MSTNet was validated on two high-resolution remote sensing semantic segmentation datasets, ISPRS Potsdam and LoveDA, reaching a mean intersection over union (mIoU) of 79.50% and 54.12% and a mean F1-score (mF1) of 87.46% and 69.34%, respectively. The experimental results show that the proposed method effectively improves semantic segmentation of remote sensing images.
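The abstract reports results as mIoU and mF1 without saying how the per-class scores are averaged. The sketch below is one common way to compute both from a pixel-level confusion matrix. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, labels are assumed to be integer class indices, and classes absent from both ground truth and prediction are skipped when averaging.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def miou_and_mf1(cm):
    """Return (mIoU, mF1) as unweighted means of the per-class IoU and F1 scores."""
    n = len(cm)
    ious, f1s = [], []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # pixels of class c predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp   # pixels of other classes predicted as c
        union = tp + fp + fn
        if union == 0:
            continue  # assumption: ignore classes that never occur in labels or predictions
        ious.append(tp / union)
        f1s.append(2 * tp / (2 * tp + fp + fn))
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


if __name__ == "__main__":
    # Toy example: four pixels, two classes (flattened label maps).
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    miou, mf1 = miou_and_mf1(confusion_matrix(truth, pred, num_classes=2))
    print(f"mIoU={miou:.4f}, mF1={mf1:.4f}")
```

Since per-class F1 equals 2·IoU / (1 + IoU), it is never lower than the per-class IoU, which fits the reported mF1 values being higher than the corresponding mIoU values on both datasets.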

remote sensing image; semantic segmentation; convolutional neural network; Transformer; global contextual information; multi-scale receptive field; encoder; decoder

Shao Kai, Wang Mingzheng, Wang Guangyu

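School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065

Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065

Engineering Research Center of Mobile Communications, Ministry of Education, Chongqing University of Posts and Telecommunications, Chongqing 400065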

2024

CAAI Transactions on Intelligent Systems
Chinese Association for Artificial Intelligence; Harbin Engineering University

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.672
ISSN: 1673-4785
Year, volume (issue): 2024, 19(4)