A Cross-Modal Transformer Point Cloud Completion Algorithm Fusing Image Information

Cross-Modal Transformer for Point Cloud Completion

He Xing (何星)¹, Zhu Zhe (朱哲)¹, Yan Xuefeng (燕雪峰)¹, Guo Yanwen (郭延文)², Gong Lina (宫丽娜)¹, Wei Mingqiang (魏明强)¹

Author information

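1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
2. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023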

Abstract

Point clouds acquired by 3D sensors (such as LiDAR and depth cameras) are often incomplete and need to be completed, yet single-modal completion methods tend to produce results with insufficient detail and incomplete structure. To address these problems, a cross-modal Transformer point cloud completion algorithm that fuses image information is proposed. First, a point cloud branch and an image branch extract point cloud features and image features, respectively; the point cloud branch uses PoinTr as its backbone network, and the image branch uses 7 convolution layers. A feature fusion module then fuses the two sets of features, and a full-resolution point cloud is generated in a coarse-to-fine manner. Experiments on the ShapeNet-ViPC dataset show that the proposed algorithm gives better visual results than single-modal point cloud completion methods and than ViPC, currently the only other cross-modal point cloud completion method. Its CD-L2 is also better than that of ViPC on most test categories, with an average CD-L2 of 2.74, 17% lower than ViPC. To make the algorithm easier for researchers to evaluate and use, the code is available at https://github.com/Starak-x/ImPoinTr.
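The abstract reports CD-L2 without defining it. As a rough illustration, the sketch below computes a symmetric Chamfer Distance with squared L2 distances in plain Python. The function names, the averaging convention, and the toy points are assumptions made for this example; they are not taken from the paper or its repository, and published results often apply an additional scale factor that is not stated here.

```python
def squared_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def chamfer_l2(pred, gt):
    """Symmetric Chamfer Distance using squared L2 distances.

    Assumed convention: the mean nearest-neighbour squared distance from
    pred to gt, plus the same quantity from gt to pred. Papers differ in
    whether they halve or rescale this sum.
    """
    if not pred or not gt:
        raise ValueError("both point clouds must be non-empty")
    pred_to_gt = sum(min(squared_dist(p, g) for g in gt) for p in pred) / len(pred)
    gt_to_pred = sum(min(squared_dist(g, p) for p in pred) for g in gt) / len(gt)
    return pred_to_gt + gt_to_pred


# Toy point clouds (made-up values, not data from the paper).
gt = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
cd = chamfer_l2(pred, gt)
```

This brute-force version costs O(|pred| x |gt|) distance evaluations, so it is only suitable for small examples; evaluating completed point clouds with thousands of points would normally rely on a vectorized or GPU implementation.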

Key words

point cloud completion / Transformer / cross-modality

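Funding

National Natural Science Foundation of China (T2322012)

National Natural Science Foundation of China (62172217)

National Natural Science Foundation of China (62032011)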

Publication year

2024

Journal of Computer-Aided Design & Computer Graphics

China Computer Federation

Indexed in: CSTPCD / CSCD / Peking University Core Journals

Impact factor: 0.892

ISSN: 1003-9775
