Journal of Shenyang Ligong University, 2024, Vol. 43, Issue (3): 18-25. DOI: 10.3969/j.issn.1003-1251.2024.03.003

Efficient Multi-modal Fusion Detection Method Based on Driving Scenes

李东宇¹ 王绪娜¹ 高宏伟¹

Author Information

  • 1. School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang 110159

Abstract

Object detection is an important component of autonomous driving. To address the problem that a single visible-light image cannot meet the demands of real driving-scene detection under low-light conditions, and to further improve detection accuracy, a traffic-scene detection network for fused infrared and visible images, called AM-YOLOv5, is proposed. The improved RepVGG structure in AM-YOLOv5 enhances the network's ability to learn features of the fused images. In addition, a self-attention mechanism is introduced at the end of the backbone network, and a new spatial pyramid module (SimSPPFCSPC) is proposed to extract sufficient information. To improve inference speed, a new convolution (GS convolution) is used at the front of the neck. Experimental results show that AM-YOLOv5 achieves 69.35% mAP0.5 on the fused images of the FLIR dataset, improving detection accuracy by 1.66% over the original YOLOv5s without sacrificing inference speed.
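The SimSPPFCSPC module named in the abstract belongs to the SPPF family of spatial pyramid modules, which gain speed by replacing SPP's parallel large-kernel pools with a chain of small serial pools. The paper does not give the module's layout, so the sketch below only demonstrates the underlying equivalence on a 1-D toy example in pure Python (the helper `max_pool_1d` is illustrative, not from the paper): two stacked stride-1 max pools with kernel 5 produce the same output as one kernel-9 pool, which is why SPPF-style modules recover SPP's 5/9/13 receptive fields cheaply.

```python
def max_pool_1d(x, k):
    """Stride-1 max pool with same-size output; edges padded with -inf."""
    pad = k // 2
    padded = [float("-inf")] * pad + list(x) + [float("-inf")] * pad
    return [max(padded[i:i + k]) for i in range(len(x))]

x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]

p5 = max_pool_1d(x, 5)           # receptive field 5
p9_serial = max_pool_1d(p5, 5)   # two serial k=5 pools -> receptive field 9
p9_direct = max_pool_1d(x, 9)    # one k=9 pool

# Max is associative and -inf is its identity, so the two paths agree.
assert p9_serial == p9_direct
```

A third serial k=5 pool would likewise match a single k=13 pool, so the module concatenates the input with the three serial pool outputs instead of running three large pools in parallel.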

Key words

object detection/multi-modal fusion/driving scenes/image fusion

Funding

Joint Open Fund of the Liaoning Provincial Key Scientific and Technological Innovation Base (2021-KF-12-05)

Publication Year

2024

Journal of Shenyang Ligong University
Publisher: Shenyang Ligong University

Impact factor: 0.223
ISSN: 1003-1251
References: 16