Building extraction from high-resolution images using a hybrid attention mechanism combined with multi-scale feature enhancement
Accurately extracting building information from high-resolution remote sensing images is challenging because of complex background variations and the diversity of building shapes. This study developed a semantic segmentation network for buildings in high-resolution images, the building mining net (BMNet), which integrates a hybrid attention mechanism with multi-scale feature enhancement. First, the encoder used VGG-16 as the backbone network to extract features, obtaining four levels of feature representations. Then, a decoder was designed to address the loss of detail in high-level features within the multi-scale information. Specifically, a series attention module (SAM), which combines channel attention and spatial attention, was introduced to enhance the representation capability of the high-level features. Additionally, a building mining module (BMM) with progressive feature enhancement was designed to further improve the accuracy of building segmentation. Taking the upsampled feature map, the feature map processed by the SAM, and the initial prediction results as input, the BMM output background noise information and then filtered out this background information using the context information exploration module designed in this study. Optimal prediction results were achieved after multiple passes through the BMM. Comparative experiments indicate that BMNet outperformed U-Net, with accuracy and intersection over union (IoU) increasing by 4.6% and 4.8%, respectively, on the WHU Building dataset, by 7.9% and 8.9%, respectively, on the Massachusetts Buildings dataset, and by 6.7% and 11.0%, respectively, on the Inria Aerial Image Labeling Dataset. These results validate the effectiveness and practicality of the proposed model.
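
The two metrics reported above can be illustrated with a minimal sketch. The abstract does not define them, so this example assumes that accuracy means overall pixel accuracy and that IoU is computed for the building class on binary masks; the function name `pixel_accuracy_and_iou` and the toy masks are hypothetical and are not taken from the study.

```python
def pixel_accuracy_and_iou(pred, truth):
    """Return (pixel accuracy, building-class IoU) for two binary masks.

    pred, truth: 2-D lists of 0/1 labels with the same shape,
    where 1 marks a building pixel (an assumed convention).
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # building predicted, building present
            elif p == 1 and t == 0:
                fp += 1  # building predicted, background present
            elif p == 0 and t == 1:
                fn += 1  # background predicted, building present
            else:
                tn += 1  # background predicted, background present

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # IoU = intersection / union of predicted and true building pixels.
    union = tp + fp + fn
    iou = tp / union if union else 1.0  # both masks empty: treated as a perfect match
    return accuracy, iou


# Toy 2x3 masks, for illustration only.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
accuracy, iou = pixel_accuracy_and_iou(pred, truth)
```

On these toy masks, four of the six pixels agree, and the predicted and true building regions share two pixels out of a union of four. IoU ignores correctly labeled background pixels, so it is usually the stricter of the two measures when buildings cover a small part of the image.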