Efficient Multi-modal Fusion Detection Method Based on Driving Scenes
LI Dongyu 1, WANG Xuna 1, GAO Hongwei 1
Author Information
- 1. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159, China
Abstract
Object detection is an important component of autonomous driving. To solve the problem that a single visible-light image cannot meet the demands of real driving-scene detection under low-light conditions, and to further improve detection accuracy, a traffic-scene detection network for fused infrared and visible images, called AM-YOLOv5, is proposed. The improved RepVGG structure in AM-YOLOv5 enhances the network's ability to learn features from fused images. In addition, a self-attention mechanism is introduced at the end of the backbone network, and a new spatial pyramid module (SimSPPFCSPC) is proposed to obtain sufficient information. To improve inference speed, a new convolution (GS convolution) is used at the front of the neck network. Experimental results show that AM-YOLOv5 achieves 69.35% mAP0.5 on fused images from the FLIR dataset; compared with the original YOLOv5s, detection accuracy improves by 1.66% without sacrificing inference speed.
Key words
object detection / multi-modal fusion / driving scenes / image fusion
Funding
Liaoning Province Key Science and Technology Innovation Base Joint Open Fund Project (2021-KF-12-05)
Year of Publication
2024