3D Object Detection Based on Edge Convolution and Bottleneck Attention Module for Point Clouds
Due to the high sparsity of point cloud data, most current point-cloud-based 3D object detection methods learn local features inadequately, and the invalid information contained in point clouds can interfere with detection. To address these problems, a 3D object detection model based on edge convolution (EdgeConv) and the bottleneck attention module (BAM) is proposed. First, multilayer edge convolutions are constructed: for each point, a K-nearest-neighbor graph is built in feature space and used to learn the multi-scale local features of the point cloud. Second, a bottleneck attention module (BAM) suited to 3D point cloud data is designed; each BAM consists of a channel attention module and a spatial attention module, which enhance the point cloud information valuable for object detection and strengthen the feature representation of the proposed model. The network takes VoteNet as its baseline, with the multilayer edge convolutions and BAM inserted sequentially between the PointNet++ backbone and the voting module. The proposed model is evaluated on the SUN RGB-D and ScanNetV2 benchmark datasets and compared with 13 state-of-the-art methods. Experimental results demonstrate that on SUN RGB-D, the proposed model achieves the highest mean average precision at an Intersection-over-Union (IoU) threshold of 0.5 (mAP@0.5), and the highest AP@0.25 for six out of ten categories, including bed, chair, and desk. On ScanNetV2, it outperforms the other methods in mAP at both IoU 0.25 and 0.5, and achieves the highest AP@0.25 for ten out of eighteen categories, including chair, sofa, and picture. Compared with the VoteNet baseline, the proposed model improves mAP@0.25 by 6.5% and 12.9% on the two datasets, respectively. Ablation studies verify the contribution of each component.
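The two building blocks summarized above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the weights are random stand-ins for learned MLPs, the EdgeConv follows the standard formulation (edge feature [x_i, x_j − x_i], shared linear map, max-aggregation over the K feature-space neighbors), and the BAM spatial branch is simplified to a per-point scalar rather than the dilated convolutions of the original BAM; the `F * (1 + sigmoid(M_c + M_s))` gating does follow the BAM design.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_graph(feats, k):
    # Pairwise squared distances in feature space; exclude the point itself.
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]          # (N, k) neighbor indices

def edge_conv(feats, k, weight):
    # EdgeConv: edge feature [x_i, x_j - x_i], shared linear map + ReLU,
    # then max-aggregation over the k neighbors of each point.
    idx = knn_graph(feats, k)
    x_i = np.repeat(feats[:, None, :], k, axis=1)      # (N, k, C)
    x_j = feats[idx]                                   # (N, k, C)
    edges = np.concatenate([x_i, x_j - x_i], axis=-1)  # (N, k, 2C)
    h = np.maximum(edges @ weight, 0.0)                # shared "MLP"
    return h.max(axis=1)                               # (N, C_out)

def bam(feats, w1, w2):
    # BAM-style gate: channel branch = global pooling + bottleneck MLP;
    # spatial branch simplified to a per-point mean (a toy stand-in).
    # Output follows BAM's residual form: F * (1 + sigmoid(Mc + Ms)).
    c = np.maximum(feats.mean(axis=0) @ w1, 0.0) @ w2  # (C,) channel logits
    s = feats.mean(axis=1, keepdims=True)              # (N, 1) per-point logits
    m = 1.0 / (1.0 + np.exp(-(c[None, :] + s)))        # broadcast, sigmoid
    return feats * (1.0 + m)

# Toy forward pass: 64 points with 16-d features, K = 8.
N, C, k = 64, 16, 8
pts = rng.normal(size=(N, C))
w_edge = rng.normal(size=(2 * C, 32)) * 0.1
local = edge_conv(pts, k, w_edge)                      # multi-scale-style local features
w1 = rng.normal(size=(32, 8)) * 0.1                    # bottleneck reduction (r = 4)
w2 = rng.normal(size=(8, 32)) * 0.1
out = bam(local, w1, w2)
print(out.shape)  # (64, 32)
```

Because the gate multiplier lies in (1, 2), the BAM output rescales rather than suppresses features, matching the residual attention form; in the full model these blocks would sit between the PointNet++ backbone and the voting module, with several EdgeConv layers stacked to capture multiple scales.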
3D object detection; Point clouds; Edge convolution; Bottleneck attention module; VoteNet; SUN RGB-D dataset; ScanNetV2 dataset
Jian Yingjie, Yang Wenxia, Fang Xi, Han Huan
School of Science, Wuhan University of Technology, Wuhan 430070, China