CAAI Transactions on Intelligent Systems, 2024, Vol. 19, Issue 4: 920-929. DOI: 10.11992/tis.202304026

Transformer-based multiscale remote sensing semantic segmentation network

Shao Kai 1, Wang Mingzheng 2, Wang Guangyu 3

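Author information

  • 1. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065; Engineering Research Center of Mobile Communications of the Ministry of Education, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • 2. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065
  • 3. School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065; Chongqing Key Laboratory of Mobile Communications Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065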

Abstract

To improve semantic segmentation of remote sensing images, where segmentation targets exhibit small inter-class variance and large intra-class variance, this paper proposes a Transformer-based multi-scale remote sensing semantic segmentation network (multi-scale Transformer network, MSTNet) built around two key points: global contextual information and multi-scale semantic features. MSTNet consists of an encoder and a decoder. The encoder comprises a visual attention network (VAN) backbone, improved on the basis of the Transformer, and a multi-scale semantic feature extraction module (MSFEM), improved on the basis of the atrous spatial pyramid pooling (ASPP) structure. The decoder uses a lightweight multilayer perceptron (MLP) designed to match the encoder, so that it fully analyzes the semantic features, containing global contextual information and multi-scale representations, that are extracted by exploiting the inductive property of the Transformer. MSTNet was validated on two high-resolution remote sensing semantic segmentation datasets, ISPRS Potsdam and LoveDA, achieving a mean intersection over union (mIoU) of 79.50% and 54.12% and a mean F1-score (mF1) of 87.46% and 69.34%, respectively. These results verify that the proposed method effectively improves semantic segmentation of remote sensing images.
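
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how mIoU and mF1 are commonly computed from a per-class confusion matrix. It is not the authors' evaluation code: the function names and the toy 3-class matrix are hypothetical, and it assumes every class is averaged with equal weight (the abstract does not say whether any class is excluded from the average).

```python
from typing import List, Tuple


def per_class_scores(confusion: List[List[int]]) -> Tuple[List[float], List[float]]:
    """Return per-class IoU and F1 from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    n = len(confusion)
    ious, f1s = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true class c, predicted as another class
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true class differs
        union = tp + fp + fn
        ious.append(tp / union if union else 0.0)         # IoU = TP / (TP + FP + FN)
        denom = 2 * tp + fp + fn
        f1s.append(2 * tp / denom if denom else 0.0)      # F1 = 2TP / (2TP + FP + FN)
    return ious, f1s


def mean_scores(confusion: List[List[int]]) -> Tuple[float, float]:
    """Unweighted mean over classes (assumption: no class is ignored)."""
    ious, f1s = per_class_scores(confusion)
    return sum(ious) / len(ious), sum(f1s) / len(f1s)


# Hypothetical 3-class confusion matrix with made-up counts (not data from the paper).
toy = [
    [50, 5, 5],
    [10, 30, 0],
    [0, 10, 40],
]
miou, mf1 = mean_scores(toy)
print(f"mIoU = {miou:.4f}, mF1 = {mf1:.4f}")
```

For a single class the two scores are tied by F1 = 2·IoU / (1 + IoU), so F1 is never lower than IoU for that class; this is consistent with the mF1 figures above being higher than the corresponding mIoU figures on both datasets.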

Keywords

remote sensing image/semantic segmentation/convolutional neural network/Transformer/global contextual information/multiscale receptive field/encoder/decoder

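Publication year

2024
CAAI Transactions on Intelligent Systems
Chinese Association for Artificial Intelligence; Harbin Engineering University

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 0.672
ISSN: 1673-4785
Cited by: 1
References: 7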