Semantic Segmentation of Dual-Source Remote Sensing Images Based on Gated Attention and Multiscale Residual Fusion
The semantic segmentation of remote sensing images is a crucial step in geographic-object-based remote sensing image analysis. Combining remote sensing image data with elevation data effectively enhances feature complementarity, thereby improving pixel-level segmentation accuracy. This study proposes a dual-source remote sensing image semantic segmentation model, STAM-SegNet, that leverages a Swin Transformer backbone network to extract multiscale features. The proposed model integrates an adaptive gated attention mechanism and a multiscale residual fusion strategy. The adaptive gated attention mechanism comprises gated channel attention and gated spatial attention. Gated channel attention strengthens the correlation between dual-source features through a competition/cooperation mechanism, effectively extracting the complementary features of the two sources. In contrast, gated spatial attention uses spatial contextual information to dynamically filter high-level semantic features and select accurate detail features. The multiscale residual fusion strategy captures multiscale contextual information via multiscale refinement and a residual structure, thereby emphasizing detail features such as shadows and boundaries and improving the model's training speed. Experiments on the Vaihingen and Potsdam datasets demonstrate that the proposed model achieves average F1-scores of 89.66% and 92.75%, respectively, surpassing networks such as DeepLabV3+, UperNet, DANet, TransUNet, and Swin-UNet in segmentation accuracy.
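To make the channel-gating idea concrete, the following is a minimal PyTorch sketch of how a gated channel attention block for dual-source (image and elevation) features could be realized. The module name, layer sizes, and the softmax-based competition/cooperation gating are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class GatedChannelAttention(nn.Module):
    """Hypothetical sketch: fuses image (RGB) and elevation (DSM) feature
    maps with per-channel gates. Illustrative only, not the authors' code."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        # Shared bottleneck MLP maps pooled channel descriptors of both
        # sources to per-channel gate logits for each source.
        self.mlp = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, 2 * channels),
        )

    def forward(self, f_rgb: torch.Tensor, f_dsm: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = f_rgb.shape
        # Global average pooling yields one descriptor per channel per source.
        desc = torch.cat([f_rgb.mean(dim=(2, 3)), f_dsm.mean(dim=(2, 3))], dim=1)
        logits = self.mlp(desc).view(b, 2, c)
        # Softmax across the two sources: channels "compete" for weight,
        # while the shared MLP lets the sources "cooperate" via joint statistics.
        gates = logits.softmax(dim=1)
        g_rgb = gates[:, 0].view(b, c, 1, 1)
        g_dsm = gates[:, 1].view(b, c, 1, 1)
        return g_rgb * f_rgb + g_dsm * f_dsm

# Usage example with assumed feature shapes:
# gca = GatedChannelAttention(channels=256)
# fused = gca(torch.randn(2, 256, 32, 32), torch.randn(2, 256, 32, 32))
```

The softmax over the source axis makes the two gates sum to one per channel, so a channel that is informative in the elevation stream is automatically down-weighted in the image stream, which is one plausible reading of the competition/cooperation mechanism described above.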
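Similarly, a minimal sketch of the multiscale residual fusion idea, assuming parallel dilated convolutions for multiscale refinement and an additive skip connection for the residual structure; the dilation rates and layer layout are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiscaleResidualFusion(nn.Module):
    """Hypothetical sketch: multiscale refinement plus a residual path.
    Kernel/dilation choices are assumptions, not the paper's design."""

    def __init__(self, channels: int):
        super().__init__()
        # Parallel dilated convolutions capture context at several scales.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in (1, 2, 4)
        ])
        self.project = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Multiscale refinement: concatenate branch outputs, project back.
        refined = self.project(torch.cat([b(x) for b in self.branches], dim=1))
        # Residual structure preserves detail features (e.g., shadows and
        # boundaries) and eases optimization, which can speed up training.
        return F.relu(x + refined)
```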