
Building extraction from high-resolution images using a hybrid attention mechanism combined with multi-scale feature enhancement

Accurately extracting building information from high-resolution remote sensing images is challenging because of complex background variation and the diversity of building shapes. This study proposes a high-resolution building semantic segmentation network, the building mining net (BMNet), which integrates a hybrid attention mechanism with multi-scale feature enhancement. First, the encoder uses VGG-16 as the backbone network to extract features, yielding four levels of feature representations. Then, a decoder is designed to address the loss of detail in high-level features within multi-scale information. Specifically, a series attention module (SAM), which applies channel attention and spatial attention in series, is introduced to strengthen the representation of high-level features. In addition, a building mining module (BMM) with progressive feature enhancement is designed to further improve the accuracy of building segmentation. Taking the upsampled feature map, the SAM-processed feature map, and the initial prediction as input, the BMM captures background noise information and then filters out background information using a context information exploration module designed in this study. The best prediction is obtained after several passes through the BMM. Comparative experiments show that BMNet outperforms U-Net in accuracy and intersection over union (IoU) by 4.6% and 4.8%, respectively, on the WHU Building dataset, by 7.9% and 8.9% on the Massachusetts Buildings dataset, and by 6.7% and 11.0% on the Inria Aerial Image Labeling dataset. These results validate the effectiveness and practicality of the proposed model.
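The abstract states only that SAM chains channel attention and spatial attention; it does not give the module's internals. As a rough illustration of what "in series" means, the sketch below applies a channel gate first and a spatial gate second to a feature map stored as nested lists `[channel][row][column]`. The function names, the use of plain average pooling, and the sigmoid gates without any learned weights are all assumptions made for illustration, not the authors' implementation.

```python
import math


def sigmoid(x):
    """Logistic function used here as the attention gate (an assumption)."""
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Scale every channel by a gate computed from its global average.

    A real module would normally pass the pooled values through learned
    layers; that step is omitted in this simplified sketch.
    """
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        gate = sigmoid(sum(values) / len(values))
        out.append([[v * gate for v in row] for row in channel])
    return out


def spatial_attention(feat):
    """Scale every pixel by a gate computed from its mean across channels."""
    n_ch = len(feat)
    height, width = len(feat[0]), len(feat[0][0])
    gates = [
        [sigmoid(sum(feat[c][i][j] for c in range(n_ch)) / n_ch)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[feat[c][i][j] * gates[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(n_ch)
    ]


def series_attention(feat):
    """Hypothetical stand-in for SAM: channel attention, then spatial."""
    return spatial_attention(channel_attention(feat))


# Toy feature map with 2 channels of size 2 x 2.
toy_features = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
refined = series_attention(toy_features)
```

The output has the same shape as the input, so a module of this kind can be placed between decoder stages without changing tensor sizes. Because the spatial gate is computed from the channel-reweighted features, the order of the two steps affects the result.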

Keywords: semantic segmentation; high spatial resolution remote sensing image; building information extraction; U-net; attention mechanism; dilated convolution

QU Haicheng, LIANG Xu


School of Software, Liaoning Technical University, Huludao 125105, China


2024

Remote Sensing for Natural Resources
China Aero Geophysical Survey and Remote Sensing Center for Land and Resources


CSTPCD; Peking University Core Journals
Impact factor: 1.275
ISSN:2097-034X
Year, volume (issue): 2024, 36(4)