A 3D object detection algorithm based on multi-modal data fusion with a dual attention mechanism
Aiming at the difficulty of detecting distant, small objects caused by the sparsity of point clouds and the insufficient information of single-modal data, a dual attention-based multi-modal fusion network (DAMFNet) for 3D object detection is proposed. First, a voxel multi-neighborhood feature extractor is designed to enlarge the voxel receptive field and fuse multiple voxel contexts, improving both the ability of voxel features to represent the spatial structure and semantics of objects and the robustness of those features. Second, multi-level semantic image features are extracted for the voxels: low-level structural features preserve object location information while high-level semantic features preserve semantic information, and both are used to enhance the voxel features. Finally, a multi-modal feature fusion module is designed, which uses channel attention to adaptively fuse features from different modalities, and voxel attention to strengthen the feature representation of relevant foreground objects while suppressing that of irrelevant background. Experimental results on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset show that, for distant, small objects, the proposed method achieves a substantial performance improvement over several mainstream single-modal and multi-modal methods.
object detection; multi-modal fusion; attention mechanism; multi-neighborhood features
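The dual-attention fusion described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the paper's implementation: the gating form, layer sizes, and the names `channel_attention_fuse` and `voxel_attention` are assumptions. Channel attention produces per-channel weights that blend the voxel and image modalities; voxel attention then produces a per-voxel scalar gate intended to boost foreground voxels and damp background ones.

```python
# Minimal sketch of dual-attention (channel + voxel) multi-modal fusion.
# All layer shapes and the sigmoid gating form are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_fuse(voxel_feat, image_feat, w_c):
    """Adaptively fuse two modalities per channel.

    voxel_feat, image_feat: (N, C) features for N non-empty voxels.
    w_c: (2C, C) projection producing per-channel gates (hypothetical).
    """
    stacked = np.concatenate([voxel_feat, image_feat], axis=1)  # (N, 2C)
    gate = sigmoid(stacked @ w_c)                               # (N, C), in (0, 1)
    # Convex per-channel blend of the two modalities.
    return gate * voxel_feat + (1.0 - gate) * image_feat

def voxel_attention(fused, w_v):
    """Per-voxel scalar gate: large for foreground, small for background."""
    score = sigmoid(fused @ w_v)   # (N, 1)
    return score * fused           # broadcasts over channels

# Toy features standing in for a voxel backbone and an image backbone.
N, C = 8, 16
voxel_feat = rng.standard_normal((N, C))
image_feat = rng.standard_normal((N, C))
w_c = rng.standard_normal((2 * C, C)) * 0.1
w_v = rng.standard_normal((C, 1)) * 0.1

fused = channel_attention_fuse(voxel_feat, image_feat, w_c)
out = voxel_attention(fused, w_v)
print(out.shape)  # (8, 16)
```

In a trained network the projections `w_c` and `w_v` would be learned layers; the sketch only shows how the two gates compose, with the voxel gate applied after the channel-wise modality blend.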