Point Cloud Upsampling Network Incorporating Transformer and Multi-stage Learning Framework

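Abstract (translated from Chinese): Drawing on the strong feature-encoding capability of the Transformer in natural language processing and computer vision, and inspired by multi-stage learning frameworks, this paper designs MSPUiT, a point cloud upsampling network that combines the Transformer with a multi-stage learning framework. The network uses a two-stage model. The first stage is a dense point generation network: a multi-layer Transformer encoder progressively converts the local geometric information and local feature information of the input point cloud into high-level semantic features, a feature expansion module upsamples the point cloud features in feature space, and a coordinate regression module maps the point cloud from feature space back to Euclidean space to produce an initial dense point cloud M'. The second stage is a point-wise refinement network: a Transformer encoder encodes the semantic features latent in M', and these are combined with the semantic features from the first stage to obtain the complete semantic features of the point cloud; a feature refinement unit extracts per-point error features from the geometric information and semantic features of M'; and an error regression module computes from these error features the coordinate offsets of the points in Euclidean space. This refines M' point by point, so that the points are distributed more uniformly and lie closer to the true object surface. In extensive experiments on the large synthetic dataset PU1K, the high-resolution point clouds generated by MSPUiT reduce the Chamfer distance (CD), the Hausdorff distance (HD), and the distance from the generated point cloud to the original point cloud patch (P2F) to 0.501×10⁻³, 5.958×10⁻³, and 1.756×10⁻³, respectively. The results show that point clouds upsampled by MSPUiT have smoother surfaces and fewer noisy points, and that their quality exceeds that of current mainstream point cloud upsampling networks.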
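
As a rough illustration of two of the metrics reported in the abstract, the sketch below computes a Chamfer distance and a Hausdorff distance between two small point sets using only the Python standard library. The abstract does not state which variant of each metric the authors use, so the conventions here are assumptions: CD is taken as the sum of the two directed mean nearest-neighbour Euclidean distances, and HD as the larger of the two directed max-min distances. The function names and the toy point sets are hypothetical.

```python
import math

def nearest_distances(src, dst):
    """For each point in src, the Euclidean distance to its nearest point in dst."""
    return [min(math.dist(p, q) for q in dst) for p in src]

def chamfer_distance(a, b):
    """Assumed convention: sum of the two directed mean nearest-neighbour distances."""
    d_ab = nearest_distances(a, b)
    d_ba = nearest_distances(b, a)
    return sum(d_ab) / len(a) + sum(d_ba) / len(b)

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the larger of the two directed max-min distances."""
    return max(max(nearest_distances(a, b)), max(nearest_distances(b, a)))

# Toy data: a "generated" cloud and a "reference" cloud with one displaced point.
generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5)]

cd = chamfer_distance(generated, reference)
hd = hausdorff_distance(generated, reference)
print(cd, hd)
```

This brute-force version is quadratic in the number of points; for clouds of realistic size a spatial index such as a k-d tree would normally replace the inner loop.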

English title: Point Cloud Upsampling Network Incorporating Transformer and Multi-stage Learning Framework

English abstract: Drawing on the Transformer's powerful feature encoding capabilities in natural language processing and computer vision, and inspired by multi-stage learning frameworks, a point cloud upsampling network, MSPUiT, that incorporates the Transformer and a multi-stage learning framework is designed. The network adopts a two-stage model. The first stage is a dense point generation network: a multi-layer Transformer encoder progressively transforms the local geometric information and local feature information of the input point cloud into high-level semantic features, the feature expansion module upsamples the point cloud features in feature space, and the coordinate regression module remaps the point cloud from feature space back to Euclidean space to generate an initial dense point cloud. The second stage is a point-by-point optimisation network: a Transformer encoder encodes the latent semantic features in the dense point cloud and combines them with the semantic features from the previous stage to obtain the complete semantic features of the point cloud, the information integration module extracts the error features of the points from the geometric information and semantic features of the dense point cloud, and the error regression module calculates the coordinate offsets of the points in Euclidean space from the error features. This realises point-by-point optimisation of the dense point cloud, so that the points are distributed more uniformly and lie closer to the real object surface. In extensive experiments on the large synthetic dataset PU1K, the high-resolution point clouds generated by MSPUiT reduce the Chamfer Distance (CD), the Hausdorff Distance (HD), and the distance from the generated point cloud to the original point cloud block (P2F) to 0.501 × 10⁻³, 5.958 × 10⁻³, and 1.756 × 10⁻³, respectively. Experimental results show that the surface of the point cloud is smoother and less noisy after upsampling by MSPUiT, and that the quality of the generated point cloud is higher than that of current mainstream point cloud upsampling networks.
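
The two-stage data flow described in the abstract can be sketched at the level of array shapes. The NumPy code below is only a structural illustration under stated assumptions, not the MSPUiT implementation: every weight is random and untrained, a single self-attention layer stands in for the multi-layer Transformer encoder, feature expansion is done by duplicating features and appending a one-hot copy code, and the names `linear`, `self_attention`, `upsample_two_stage`, `ratio`, and `dim` are hypothetical. What it does show is the coarse-then-refine pattern: stage one regresses a dense cloud from expanded features, and stage two predicts per-point offsets that are added to that cloud.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(n_in, n_out):
    """Random, untrained weight matrix standing in for a learned layer."""
    return rng.normal(scale=0.1, size=(n_in, n_out))

def self_attention(feats, dim):
    """Single-head scaled dot-product self-attention with a residual connection."""
    q = feats @ linear(dim, dim)
    k = feats @ linear(dim, dim)
    v = feats @ linear(dim, dim)
    scores = q @ k.T / np.sqrt(dim)
    scores -= scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return feats + weights @ v

def upsample_two_stage(points, ratio=4, dim=16):
    n = points.shape[0]

    # Stage 1 (dense point generation): encode, expand features, regress coordinates.
    feats = self_attention(points @ linear(3, dim), dim)   # (n, dim)
    expanded = np.repeat(feats, ratio, axis=0)              # (ratio*n, dim)
    codes = np.tile(np.eye(ratio), (n, 1))                  # one-hot code per copy
    expanded = np.concatenate([expanded, codes], axis=1)    # (ratio*n, dim+ratio)
    coarse = expanded @ linear(dim + ratio, 3)              # dense cloud M'

    # Stage 2 (point-wise refinement): encode M', reuse stage-1 features, regress offsets.
    feats2 = self_attention(coarse @ linear(3, dim), dim)   # (ratio*n, dim)
    joint = np.concatenate([feats2, expanded, coarse], axis=1)
    offsets = joint @ linear(2 * dim + ratio + 3, 3)        # per-point coordinate offsets
    return coarse, coarse + offsets

points = rng.random((32, 3))            # toy sparse input cloud
coarse, refined = upsample_two_stage(points)
print(coarse.shape, refined.shape)
```

Because the weights are random, the output coordinates are meaningless; only the shapes and the residual form `coarse + offsets` carry over to a trained network.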

Keywords:

Transformer encoder; Multi-stage learning framework; Feature conversion; Point cloud upsampling; Deep learning

Authors:

Li Zekai (李泽锴), Bai Zhengyao (柏正尧), Xiao Xiao (肖霄), Zhang Yihan (张奕涵), You Yilin (尤逸琳)

Affiliation:

School of Information, Yunnan University, Kunming 650500, China

Funding:

Major Science and Technology Special Project of Yunnan Province

Project number:

202002AD080001

Publication year:

2024

DOI:

10.11896/jsjkx.230300154

Journal: Computer Science (计算机科学)

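Publisher: Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)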

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(6)

Number of references: 33