To improve the semantic segmentation of remote sensing images, this paper proposes a multi-scale Transformer network (MSTNet) based on the characteristics of segmentation targets, namely small inter-class variance and large intra-class variance, focusing on two key points: global contextual information and multi-scale semantic features. MSTNet consists of an encoder and a decoder. The encoder includes an improved Transformer-based visual attention network (VAN) backbone and an improved multi-scale semantic feature extraction module (MSFEM) based on atrous spatial pyramid pooling (ASPP) to extract multi-scale semantic features. The decoder is designed with a lightweight multi-layer perceptron (MLP) combined with the encoder to fully analyze the global contextual information and multi-scale representation features, utilizing the inductive property of the Transformer. The proposed MSTNet was validated on two high-resolution remote sensing semantic segmentation datasets, ISPRS Potsdam and LoveDA, achieving a mean intersection over union (mIoU) of 79.50% and 54.12%, and a mean F1-score (mF1) of 87.46% and 69.34%, respectively. The experimental results verify that the proposed method effectively improves the semantic segmentation of remote sensing images.