Re-parameterization Enhanced Dual-modal Real-time Object Detection Model

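Abstract (translated from the Chinese): Objects in high-altitude drone imagery are generally small and weak in features, and they are strongly affected by complex weather conditions, so object detection based on single-modal visible or infrared images suffers from high missed-detection and false-detection rates. To address this, the paper proposes DM-YOLO, a re-parameterization enhanced dual-modal real-time object detection model. First, visible and infrared images are fused by channel concatenation, which combines the complementary information of the two modalities at very low cost. Second, a more efficient re-parameterization module is proposed and used to build a more powerful backbone network, RepCSPDarkNet, which effectively strengthens the backbone's feature extraction for dual-modal images. Third, a multi-level feature fusion module is proposed; it fuses the multi-scale feature information of weak and small objects through multi-receptive-field convolution and an attention mechanism, enhancing their multi-scale feature representation. Finally, the deep detection layer of the feature pyramid, which contributes almost nothing to detecting weak and small objects, is removed, reducing the model size while leaving detection accuracy unchanged. Experiments on the large-scale dual-modal image dataset DroneVehicle show that the detection accuracy of DM-YOLO is 2.45% higher than that of the baseline YOLOv5s and better than that of YOLOv6 and YOLOv7 models of comparable size. The model effectively improves the accuracy and robustness of object detection under complex illumination conditions, and its detection speed reaches 82 FPS, which meets the requirements of real-time detection.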
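The channel-concatenation fusion mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes a spatially aligned image pair, a 3-channel visible image and a single-channel infrared image; the function name `fuse_by_channel_concat` and the 640×640 input size are hypothetical choices for illustration, not details taken from the paper.

```python
import numpy as np


def fuse_by_channel_concat(visible: np.ndarray, infrared: np.ndarray) -> np.ndarray:
    """Stack an aligned visible/infrared pair along the channel axis.

    visible:  (H, W, 3) array (assumed RGB).
    infrared: (H, W) or (H, W, C) array (assumed already registered to `visible`).
    """
    if infrared.ndim == 2:
        # Give a grayscale infrared image an explicit channel axis.
        infrared = infrared[..., np.newaxis]
    if visible.shape[:2] != infrared.shape[:2]:
        raise ValueError("visible and infrared images must have the same height and width")
    # The fused tensor simply carries both modalities as extra channels.
    return np.concatenate([visible, infrared], axis=-1)


# Hypothetical inputs: a blank visible image and a constant infrared image.
visible = np.zeros((640, 640, 3), dtype=np.float32)
infrared = np.ones((640, 640), dtype=np.float32)
fused = fuse_by_channel_concat(visible, infrared)
```

Under these assumptions the only change a detector needs is a first convolution that accepts the larger channel count, which is consistent with the abstract's description of this fusion as very low cost.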

English title: Re-parameterization Enhanced Dual-modal Realtime Object Detection Model

English abstract: Objects captured by drones at high altitudes are generally small and have weak features, and they are strongly affected by complex weather conditions. As a result, object detection based on visible or infrared images alone often has high rates of missed and false detections. To address this problem, this paper proposes DM-YOLO, a dual-modal real-time object detection model with re-parameterization enhancement. First, the visible and infrared images are fused by channel concatenation, which makes efficient use of the complementary information in the dual-modal images at very low cost. Second, a more efficient re-parameterization module is proposed, and a more powerful backbone network, RepCSPDarkNet, is built on it, which effectively improves the backbone's feature extraction capability for dual-modal images. Third, a multi-level feature fusion module is proposed; it fuses the multi-scale feature information of weak and small objects using multi-receptive-field dilated convolution and an attention mechanism, which enhances their multi-scale feature representation. Finally, the deep detection layer of the feature pyramid is removed, which reduces the model size while maintaining detection accuracy. Experimental results on the large-scale dual-modal image dataset DroneVehicle show that the detection accuracy of DM-YOLO is 2.45% higher than that of the baseline YOLOv5s and better than that of the YOLOv6 and YOLOv7 models. Furthermore, the model effectively improves the accuracy and robustness of object detection under complex weather conditions while achieving a detection speed of 82 frames per second, which meets the requirements of real-time detection.
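The abstract does not describe the internals of the proposed re-parameterization module. As background only, the sketch below shows the general idea of structural re-parameterization in its common RepVGG-style form: parallel 3×3 and 1×1 convolution branches used during training are merged into a single 3×3 convolution for inference. This is an assumed, generic illustration, not the DM-YOLO module; all function names and tensor sizes are hypothetical.

```python
import numpy as np


def conv2d_same(x: np.ndarray, w: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Naive stride-1 'same' convolution (cross-correlation, as in deep learning).

    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k, b: (C_out,).
    """
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.empty((c_out, h, wd))
    for i in range(h):
        for j in range(wd):
            patch = xp[:, i:i + k, j:j + k]
            # Contract over input channels and both kernel axes.
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2])) + b
    return out


def merge_branches(w3, b3, w1, b1):
    """Fold a parallel 1x1 branch into a 3x3 branch (both stride 1)."""
    # A 1x1 kernel equals a 3x3 kernel that is zero everywhere except the centre.
    w1_as_3x3 = np.pad(w1, ((0, 0), (0, 0), (1, 1), (1, 1)))
    return w3 + w1_as_3x3, b3 + b1


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w3, b3 = rng.standard_normal((6, 4, 3, 3)), rng.standard_normal(6)
w1, b1 = rng.standard_normal((6, 4, 1, 1)), rng.standard_normal(6)

two_branch = conv2d_same(x, w3, b3) + conv2d_same(x, w1, b1)  # training-time form
single = conv2d_same(x, *merge_branches(w3, b3, w1, b1))      # inference-time form
```

Because convolution is linear, the two forms produce the same output up to floating-point error. The merged layer keeps the multi-branch training structure while paying for only one convolution at inference.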

English keywords:

Reparameterization; Dual modality; Real-time object detection; Multiscale features; Attention mechanism

Authors:

LI Yunchen (李允臣), ZHANG Rui (张睿), WANG Jiabao (王家宝), LI Yang (李阳), WANG Ziqi (王梓祺), CHEN Yao (陈瑶)

Affiliation:

College of Command and Control Engineering, Army Engineering University (陆军工程大学指挥控制工程学院), Nanjing 210007, China

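Keywords (translated from the Chinese):

re-parameterization; dual modality; real-time object detection; multi-scale features; attention mechanism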

Funding:

Natural Science Research Fund for Colleges and Universities of Jiangsu Province (江苏省高校自然科学研究基金)

Project number:

BK20200581

Publication year:

2024

DOI:

10.11896/jsjkx.230700106

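Journal: Computer Science (计算机科学)

Publisher: Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)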

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.944

ISSN: 1002-137X

Year, volume (issue): 2024, 51(9)