LiDAR Point Cloud Tracking Method Using Point-Voxel Relationship Modeling Under 3D Sparse Convolutional Framework

The potential of sparse convolution for single-object tracking in LiDAR (Light Detection and Ranging) point clouds has not been fully explored. Most existing point cloud trackers use point-based backbone networks built on ball-neighborhood queries, which consume substantial GPU memory and computation and model target-aware relations insufficiently. To address this problem, this paper proposes a LiDAR point cloud tracking algorithm based on a sparse convolutional framework and incorporates a point-voxel dual-channel relationship modeling module that efficiently embeds target-discriminative information within the sparse framework. First, a 3D sparse convolutional residual network extracts features from the template and the search area separately, and deconvolution then recovers pointwise features to meet the tracking task's requirement for spatial position information. Second, the relationship modeling module computes a semantic similarity query table between the template features and the search-area features. To capture fine-grained correlations between the template and the search area, the module works in two channels. In the spatial point channel, a nearest-neighbor search finds the neighboring template points of each search-area point and extracts the corresponding features according to the query table. In the voxel channel, local multi-scale voxels are constructed around each search-area point, and the query-table values are accumulated over the indices of the template points that fall into each voxel cell. Finally, the features of the two channels are fused and fed into a bird's-eye-view-based candidate bounding box generation module to regress the target bounding box. The method was evaluated on the KITTI and NuScenes datasets; compared with other algorithms that use sparse convolution, its mean success and precision rates improve by 11.0% and 12.0%, respectively. The proposed method thus inherits the efficiency of sparse convolution while also improving tracking accuracy.
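
As a rough illustration of the relationship modeling step described in the abstract, the following NumPy sketch builds a similarity query table and then reads it through a point channel and a voxel channel. It is a minimal sketch under stated assumptions, not the authors' implementation: the cosine similarity, the neighbor count `k`, the 3 x 3 x 3 local grid, the cell sizes, and the plain concatenation used for fusion are all illustrative choices that the abstract does not specify, and every function name is hypothetical.

```python
import numpy as np

def similarity_table(search_feat, template_feat):
    """Cosine similarity between every search point and every template point (assumed metric)."""
    s = search_feat / (np.linalg.norm(search_feat, axis=1, keepdims=True) + 1e-8)
    t = template_feat / (np.linalg.norm(template_feat, axis=1, keepdims=True) + 1e-8)
    return s @ t.T  # shape (N_search, N_template)

def point_channel(search_xyz, template_xyz, table, k=2):
    """For each search point, gather the table entries of its k nearest template points."""
    dist = np.linalg.norm(search_xyz[:, None, :] - template_xyz[None, :, :], axis=2)
    nn_idx = np.argsort(dist, axis=1)[:, :k]           # (N_search, k), nearest first
    return np.take_along_axis(table, nn_idx, axis=1)   # (N_search, k)

def voxel_channel(search_xyz, template_xyz, table, grid=3, cell_sizes=(0.5, 1.0)):
    """For each search point, sum table values of the template points falling into
    each cell of a local grid^3 voxel block, repeated at several cell sizes."""
    n_s, n_t = table.shape
    rows = np.repeat(np.arange(n_s), n_t).reshape(n_s, n_t)    # rows[i, j] == i
    rel = template_xyz[None, :, :] - search_xyz[:, None, :]    # (N_s, N_t, 3)
    feats = []
    for cell in cell_sizes:
        half = grid * cell / 2.0
        inside = np.all(np.abs(rel) < half, axis=2)            # template point within the block
        ijk = np.clip(np.floor((rel + half) / cell).astype(int), 0, grid - 1)
        flat = (ijk[..., 0] * grid + ijk[..., 1]) * grid + ijk[..., 2]
        feat = np.zeros((n_s, grid ** 3))
        # Unbuffered accumulation, so several template points can add to the same cell.
        np.add.at(feat, (rows[inside], flat[inside]), table[inside])
        feats.append(feat)
    return np.concatenate(feats, axis=1)                       # (N_s, len(cell_sizes) * grid^3)

# Toy data: 2 search points, 3 template points, 8-dimensional features (all made up).
rng = np.random.default_rng(0)
search_xyz = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
template_xyz = np.array([[0.1, 0.1, 0.1], [10.2, 0.0, 0.0], [0.2, -0.1, 0.0]])
search_feat = rng.normal(size=(2, 8))
template_feat = rng.normal(size=(3, 8))

table = similarity_table(search_feat, template_feat)
point_feat = point_channel(search_xyz, template_xyz, table, k=2)
voxel_feat = voxel_channel(search_xyz, template_xyz, table)
fused = np.concatenate([point_feat, voxel_feat], axis=1)       # assumed fusion: concatenation
```

The dense pairwise distance matrix keeps the sketch short; a practical tracker would more likely use a spatial index or a GPU kernel for the neighbor search and the voxel assignment.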

point cloud understanding; object tracking; machine vision; sparse convolution; feature fusion

Tian Shengjing (田胜景), Han Yinan (韩一男), Zhao Xiantong (赵宪通), Liu Xiuping (刘秀平), Zhang Ming (张明)


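School of Economics and Management, China University of Mining and Technology, Xuzhou, Jiangsu 221116

DUT-BSU Joint Institute, Dalian University of Technology, Dalian, Liaoning 116024

School of Mathematical Sciences, Dalian University of Technology, Dalian, Liaoning 116024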


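Supported by the National Natural Science Foundation of China (62301562); the China Postdoctoral Science Foundation (2023M733756); the Fundamental Research Funds for the Central Universities (2023QN1055)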

2024

Acta Electronica Sinica
Chinese Institute of Electronics


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.237
ISSN: 0372-2112
Year, volume (issue): 2024, 52(10)