Point Cloud Upsampling Network Incorporating Transformer and Multi-stage Learning Framework
Drawing on the Transformer's powerful feature-encoding capabilities in natural language processing and computer vision, and inspired by multi-stage learning frameworks, a point cloud upsampling network, MSPUiT, is designed that combines a Transformer with a multi-stage learning framework. The network adopts a two-stage model. The first stage is a dense point generation network: a multi-layer Transformer encoder progressively transforms the local geometric information and local feature information of the input point cloud into high-level semantic features, a feature expansion module upsamples the point cloud features in feature space, and a coordinate regression module maps the features back to Euclidean space to generate an initial dense point cloud. The second stage is a point-by-point optimisation network: a Transformer encoder encodes the latent semantic features of the dense point cloud and combines them with the semantic features from the first stage to obtain the complete semantic features of the point cloud; an information integration module extracts per-point error features from the geometric information and semantic features of the dense point cloud; and an error regression module computes the coordinate offset of each point in Euclidean space from the error features. This point-by-point optimisation makes the distribution of points more uniform and brings them closer to the real object surface. In extensive experiments on the large synthetic dataset PU1K, the high-resolution point clouds generated by MSPUiT reduce the Chamfer Distance (CD), Hausdorff Distance (HD) and the distance from the generated point cloud to the original point cloud block (P2F) to 0.501 × 10⁻³, 5.958 × 10⁻³ and 1.756 × 10⁻³, respectively. The experimental results show that point clouds upsampled by MSPUiT have smoother surfaces and less noise, and that the quality of the generated point clouds is higher than that of current mainstream point cloud upsampling networks.
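To make the reported metrics concrete, the following is a minimal NumPy sketch of the Chamfer Distance and the Hausdorff Distance between two point sets. The function names are illustrative. The exact conventions are assumptions: here CD is taken as the sum of the two mean squared nearest-neighbour distances and HD as the symmetric maximum of unsquared nearest-neighbour distances. The abstract does not state which variants MSPUiT's evaluation uses, and published implementations differ in squaring and averaging. P2F is omitted because it requires the reference surface, which is not available from point sets alone.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]  # shape (N, M, 3) via broadcasting
    return np.sum(diff * diff, axis=-1)   # shape (N, M)


def chamfer_distance(a, b):
    """Assumed convention: mean squared nearest-neighbour distance, summed over both directions."""
    d = pairwise_sq_dists(a, b)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def hausdorff_distance(a, b):
    """Assumed convention: symmetric Hausdorff distance using unsquared distances."""
    d = np.sqrt(pairwise_sq_dists(a, b))
    return max(d.min(axis=1).max(), d.min(axis=0).max())


# Toy example: a sparse set and a denser set containing one extra point.
sparse = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
dense = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])

cd = chamfer_distance(sparse, dense)
hd = hausdorff_distance(sparse, dense)
```

The brute-force distance matrix costs O(NM) memory, which is fine for patch-sized inputs but would typically be replaced by a spatial index for full-resolution clouds.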