Image-Radar Fusion Algorithm for 3D Object Detection (图像-雷达融合的三维目标检测算法)

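Abstract (translated from Chinese): To address the consistency and validity of data when fusing multimodal information in three-dimensional space, this paper proposes BEVIRF, an encoding module that fuses image and radar data in bird's-eye view (BEV). Unlike conventional perspective-view schemes, which lack depth information, the method uses an improved variable-attention scheme to aggregate image and radar information, resolving the unified representation of the different modalities and producing BEV feature maps that are semantically rich and carry spatial position information. In addition, a dynamic position encoding is proposed within the Transformer-based network: it generates position encodings by perceiving each object's spatial information, so that the model concentrates on target regression and the instability of querying and matching is reduced. The proposed scheme achieves excellent results on the nuScenes dataset.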

English title: Image-radar fusion algorithm for 3D object detection

English abstract: To address the challenges of data consistency and validity in the fusion of multimodal information within three-dimensional space, this paper proposes BEVIRF, an encoding module for image-radar fusion in bird's-eye view (BEV). Compared with traditional schemes that lack depth information in perspective view, this method uses an improved variable-attention scheme to aggregate image and radar information, solves the problem of a unified representation for the different modalities, and generates semantically rich BEV feature maps that contain spatial location information. In addition, a dynamic position encoding is proposed for the Transformer-based network structure. It generates the position encoding by sensing the spatial information of each object, so that the model can focus on target regression and the instability of querying and matching is reduced. The proposed method achieves strong results on the nuScenes dataset compared with state-of-the-art methods.
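The abstract does not give the formulation of the dynamic position encoding. Purely as an illustration of the general idea (a position encoding recomputed from each query's current 3D reference point instead of being read from a fixed table), a minimal sketch follows. The function names, the feature size, the temperature, and the sin/cos schedule are all assumptions borrowed from common Transformer practice, not the authors' design.

```python
import math


def sine_position_encoding(point, num_feats=4, temperature=10000.0):
    """Sin/cos encoding of a normalized 3D point (each coordinate in [0, 1]).

    Returns a flat list of length 3 * num_feats (num_feats must be even).
    """
    if num_feats % 2 != 0:
        raise ValueError("num_feats must be even")
    encoding = []
    for coord in point:
        for i in range(num_feats // 2):
            # Hypothetical frequency schedule; not taken from the paper.
            freq = temperature ** (2 * i / num_feats)
            angle = 2 * math.pi * coord / freq
            encoding.extend([math.sin(angle), math.cos(angle)])
    return encoding


def refine_reference_point(point, offset):
    """Shift a reference point by a predicted offset, clamped to [0, 1]."""
    return tuple(min(1.0, max(0.0, c + d)) for c, d in zip(point, offset))


# Assumed usage: each decoder layer re-encodes the query's current estimate,
# so the position encoding follows the object instead of staying fixed.
point = (0.50, 0.25, 0.10)  # illustrative normalized (x, y, z)
layer_offsets = [(0.05, -0.02, 0.0), (0.01, 0.01, 0.0)]  # made-up values
encodings = [sine_position_encoding(point)]
for offset in layer_offsets:
    point = refine_reference_point(point, offset)
    encodings.append(sine_position_encoding(point))
```

In a real detector the offsets would come from the regression head and the encoding would typically pass through a small learned projection before being added to the query; both are omitted here to keep the sketch dependency-free.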

English keywords:

three-dimensional object detection; convolutional neural network; attention mechanism; bird's-eye-view features

Authors:

Cai Ganlin (蔡甘霖), Chen Feng (陈锋), Zhang Senlin (张森林)

Affiliation:

College of Physics and Information Engineering, Fuzhou University, Fuzhou, Fujian 350108, China

Keywords (translated from Chinese):

3D object detection; convolutional neural network; attention mechanism; BEV features

Publication year:

2024

DOI:

10.7631/issn.1000-2243.23308

Journal of Fuzhou University (Natural Science Edition)

Fuzhou University

Indexed in: CSTPCD; PKU Core (北大核心)

Impact factor: 0.35

ISSN: 1000-2243

Year, volume (issue): 2024, 52(6)