Buildings extraction from UAV images based on improved Mask R-CNN
Automatic building extraction from unmanned aerial vehicle (UAV) images is crucial for urban and rural planning and management. However, complex background interference and highly variable building appearance make instance-level extraction challenging. This paper proposes an improved Mask region-based convolutional neural network (Mask R-CNN) method for automatic instance extraction of buildings from UAV images. The improved method uses ResNet-101 as the feature extraction network and adds bottom-up paths to the feature fusion network to enhance the localization ability of the whole feature hierarchy. In addition, an atrous spatial pyramid pooling (ASPP) module is added to the feature fusion network to strengthen multiscale feature representation and improve model performance. Experimental results on a self-built building dataset show that, compared with the original Mask R-CNN, the improved method increases the mean average precision (mAP) by 2.6% and extracts building instances from UAV images effectively.
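To illustrate why an ASPP module adds multiscale ability, the following is a minimal one-dimensional sketch of atrous (dilated) convolution in plain Python. It is not the implementation used in this paper: the function names, the shared kernel, and the dilation rates are assumptions chosen for illustration (the abstract does not state the rates used), and a real ASPP module operates on 2-D feature maps with a separately learned kernel per branch, followed by concatenation and a 1x1 convolution.

```python
from typing import List, Sequence


def effective_kernel_size(kernel_size: int, rate: int) -> int:
    """Span covered by a kernel whose taps are spaced `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal: Sequence[float],
                  kernel: Sequence[float],
                  rate: int) -> List[float]:
    """1-D atrous convolution with zero padding (odd-length kernel assumed).

    The output has the same length as the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate      # neighbouring taps are `rate` apart
            if 0 <= j < len(signal):       # samples outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_1d(signal: Sequence[float],
            kernel: Sequence[float],
            rates: Sequence[int]) -> List[List[float]]:
    """Hypothetical ASPP-style helper: one output branch per dilation rate.

    Simplification: every branch reuses the same kernel, whereas a trained
    ASPP module learns a separate kernel for each branch.
    """
    return [atrous_conv1d(signal, kernel, r) for r in rates]


if __name__ == "__main__":
    # Illustrative rates only; they are not taken from the paper.
    for r in (1, 2, 4):
        print(r, effective_kernel_size(3, r))
    print(aspp_1d([1, 2, 3, 4, 5], [1, 1, 1], rates=(1, 2)))
```

With a fixed number of weights, a larger rate covers a wider context: a 3-tap kernel spans 3 samples at rate 1 but 13 samples at rate 6. Running several rates in parallel is what lets the fused features respond to both small and large buildings.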