Three-Dimensional Reconstruction Method for Single-View Optical Remote Sensing Images Based on Semantic Segmentation and Residual U-Net Fusion
Three-dimensional (3D) reconstruction from a single-view remote sensing image is an ill-posed problem with no unique solution, and building a complete 3D model often requires substantial manual expertise to supply the missing information. To address this problem, a 3D reconstruction method for single-view remote sensing images based on semantic segmentation and a fused residual U-Net is proposed. The method comprises two stages: semantic segmentation and height estimation from single-view remote sensing images. In the semantic segmentation stage, U-Net is used to determine the attributes of ground objects. On this basis, an improved U-Net estimates height from the remote sensing image, and anchored height regression is performed jointly with the semantic features to improve reconstruction accuracy. In the improved U-Net, residual blocks of varying number and channel width are embedded to strengthen the feature extraction capability of the encoder, and the decoder output layer is modified to suit the height regression task, so that digital surface model (DSM) height values are predicted pixel by pixel. On the public US3D dataset, the method achieves a root mean square error (RMSE) of 2.751 m and a mean absolute error (MAE) of 1.446 m, outperforming the other networks compared. These results confirm that the method achieves 3D estimation from single-view remote sensing images and can reconstruct the spatial distribution and structure of ground objects.
semantic segmentation; deep residual learning; residual U-Net fusion; single-view 3D reconstruction
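
The two accuracy figures quoted in the abstract, RMSE and MAE, are standard pixel-wise error measures between a predicted and a reference DSM. The following is a minimal sketch of how such metrics could be computed, not the authors' evaluation code. The function name `height_errors`, the `valid_mask` parameter, and the convention that non-finite reference pixels mark missing heights are all assumptions made for illustration.

```python
import numpy as np


def height_errors(pred_dsm, true_dsm, valid_mask=None):
    """Return (RMSE, MAE) between two height maps, in the inputs' unit.

    Hypothetical helper: the name, signature, and masking convention
    are illustrative assumptions, not taken from the paper.
    """
    pred = np.asarray(pred_dsm, dtype=np.float64)
    true = np.asarray(true_dsm, dtype=np.float64)
    if pred.shape != true.shape:
        raise ValueError("prediction and reference must have the same shape")

    if valid_mask is None:
        # Assumption: NaN or infinite reference pixels mean "no height data".
        mask = np.isfinite(true)
    else:
        mask = np.asarray(valid_mask, dtype=bool)

    # Per-pixel height differences over valid pixels only.
    diff = (pred - true)[mask]
    if diff.size == 0:
        raise ValueError("no valid pixels to evaluate")

    rmse = float(np.sqrt(np.mean(diff ** 2)))  # root mean square error
    mae = float(np.mean(np.abs(diff)))         # mean absolute error
    return rmse, mae


if __name__ == "__main__":
    # Toy 2x2 height maps in metres (made-up values, not US3D data).
    predicted = [[1.0, 2.0], [3.0, 8.0]]
    reference = [[1.0, 4.0], [3.0, 4.0]]
    rmse, mae = height_errors(predicted, reference)
    print(f"RMSE = {rmse:.3f} m, MAE = {mae:.3f} m")
```

Because RMSE squares each difference before averaging, it penalizes large height errors more heavily than MAE. This is consistent with the reported RMSE (2.751 m) being larger than the reported MAE (1.446 m).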