Fast Two-Stage 3D Object Detection with Semantic Guidance
With the continuous increase in LiDAR sampling rates, systems can rapidly acquire high-resolution point clouds of a scene. Dense point clouds improve the accuracy of 3D object detection; however, they also increase the computational load. In addition, point-based 3D object detection methods struggle to balance speed and accuracy. To improve the computational efficiency of multilevel downsampling in 3D object detection and to address the low foreground-point recall and size ambiguity of one-stage networks, a fast two-stage method based on semantic guidance is proposed herein. In the first stage, a semantic-guided downsampling method enables the deep network to perceive foreground points efficiently. In the second stage, a channel-aware pooling method aggregates the semantic information of the sampled points by adding pooled points, thereby enriching the feature description of regions of interest and yielding more accurate proposal boxes. Test results on the KITTI dataset show that, compared with similar two-stage baseline methods, the proposed method improves detection accuracy by up to 4.62, 1.44, and 3.91 percentage points for cars, pedestrians, and cyclists, respectively. Furthermore, the inference speed reaches 55.6 frames/s, surpassing the fastest benchmark by 31.1%. The algorithm performs strongly in both accuracy and speed, giving it practical value for real-world applications.
Keywords: point cloud; semantic-guided downsampling; channel-aware pooling; 3D object detection
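The abstract does not spell out how semantic guidance enters the downsampling step. One common way to realize such a scheme, shown here purely as a hedged sketch (the function name, the score-weighting rule, and the choice of seed point are all assumptions, not the paper's actual algorithm), is to bias farthest point sampling so that points a semantic head scores as likely foreground are retained preferentially:

```python
import numpy as np

def semantic_guided_fps(points, fg_scores, n_sample, gamma=1.0):
    """Farthest point sampling biased toward likely foreground points.

    Hypothetical sketch: each point's FPS distance is scaled by its
    predicted foreground score raised to gamma, so geometrically
    spread-out points that also look like foreground win the argmax.

    points:    (N, 3) xyz coordinates
    fg_scores: (N,) predicted foreground probabilities in [0, 1]
    n_sample:  number of points to keep
    """
    n = points.shape[0]
    weights = np.power(fg_scores, gamma)        # foreground emphasis
    selected = np.zeros(n_sample, dtype=np.int64)
    min_dist = np.full(n, np.inf)
    # Assumed choice: seed from the highest-scoring point rather
    # than a random one, so the first sample is likely foreground.
    selected[0] = int(np.argmax(fg_scores))
    for i in range(1, n_sample):
        diff = points - points[selected[i - 1]]
        dist = np.einsum("ij,ij->i", diff, diff)  # squared distances
        min_dist = np.minimum(min_dist, dist)
        # Semantic guidance: weight geometric spread by the score.
        selected[i] = int(np.argmax(min_dist * weights))
    return selected
```

With gamma = 0 this degenerates to plain farthest point sampling; larger gamma trades geometric coverage for foreground recall, which matches the abstract's goal of raising foreground-point recall during multilevel downsampling.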