Multimodal information fusion for dynamic target recognition in autonomous driving

Zhang Mingrong (张明容)¹, Yu Hao (喻皓)², Lü Hui (吕辉)³, Jiang Libiao (姜立标)³, Li Liping (李利平)³, Lu Lei (卢磊)⁴

Author affiliations

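1. School of Automotive Technology, Guangdong Industry Polytechnic, Guangzhou 510000
2. R&D Center, GAC AION New Energy Automobile Co., Ltd., Guangzhou 511400
3. School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou 510641
4. Engineering Research Institute, Guangzhou City University of Technology, Guangzhou 510800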

Abstract

This study proposes a multimodal information fusion method for object recognition in autonomous driving, aimed at detecting vehicles and pedestrians in driving environments. First, the ResNet50 network is improved by introducing a spatial attention mechanism and hybrid dilated convolution: selective kernel convolution replaces some of the convolutional layers, so that the network can dynamically adjust the size of its receptive field according to the feature scale. Next, sawtooth-pattern hybrid dilated convolution is used in the convolutional layers to capture multi-scale contextual information and strengthen feature extraction. The localization loss in YOLOv3 is then replaced with the GIoU loss, which is easier to apply in practice. Finally, a vehicle and pedestrian classification and recognition algorithm based on the fusion of two kinds of data is proposed to improve detection accuracy. Experiments show that, compared with OFTNet, VoxelNet, and Faster R-CNN, the method improves mAP by up to 0.05 in daytime and by up to 0.09 at night, and converges well.
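The abstract names the GIoU loss but does not give its formula. As a rough illustration, the sketch below computes the standard GIoU definition for two axis-aligned boxes, GIoU = IoU - (C - U) / C, where U is the union area and C is the area of the smallest box enclosing both. The function names `giou` and `giou_loss` and the `(x1, y1, x2, y2)` box format are assumptions made for this example, not details taken from the paper.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes.

    Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns a value in (-1, 1].
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to 0 when boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the empty space inside the enclosing box.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Loss form commonly used for box regression: 1 - GIoU, in [0, 2)."""
    return 1.0 - giou(pred_box, target_box)
```

Unlike plain IoU, which is 0 for every pair of non-overlapping boxes, GIoU keeps decreasing as the boxes move apart, so the loss still distinguishes near misses from far misses. For example, `giou_loss((0, 0, 1, 1), (2, 0, 3, 1))` is 4/3, while two identical boxes give a loss of 0.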

Key words

autonomous driving/ResNet50/YOLOv3/data fusion/attention mechanism/loss function

Funding

National Natural Science Foundation of China (51975217)

Publication year

2024

Journal of Chongqing University

Chongqing University

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)

Impact factor: 0.601

ISSN: 1000-582X

Number of references: 20
