

Roadside object detection algorithm with multi-scale feature fusion and interaction
To address dense small targets, multi-scale variation, and interference from complex weather backgrounds in roadside-view object detection, this paper proposes MF-YOLO, an object detection algorithm based on multi-scale feature fusion and interaction. A C2f-CAST module is designed that uses star operations to interact and transform features from different subspaces, and that introduces MLCA to capture local and global features between distant pixels as well as channel and spatial features. The resulting multi-scale information aggregation strengthens attention to the salient semantic information of occluded targets and eliminates background influence. To address the low efficiency of context information fusion in the neck, the lightweight convolution GSConv is added to optimize conventional convolution, and a cross-stage partial network module, VoV-GSCSP, is designed to reduce model complexity and parameter count. A cross-level fusion module, SDFM, is constructed to self-calibrate shallow feature maps and fuse in semantic information from deep feature maps, addressing missed detections of small targets. Finally, a gradient adjustment function based on an adaptive penalty factor and anchor box quality is designed and combined with a dynamic clustering mechanism to improve the WPIoU loss function, enhancing bounding box regression and detection robustness. Experimental results show that MF-YOLO achieves mAP@0.5 of 85.1% and 92.3% on the DAIR-V2X-I and UA-DETRAC datasets, respectively, which is 4.4% and 1.8% higher than the original YOLOv8s, while computation (GFLOPs) falls by 19.8% and parameter count by 8.18%. The detection speed reaches 152 fps, meeting real-time requirements.
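The abstract does not define the star operation. As a rough illustration only, the sketch below assumes the common meaning of the term: two linear projections of the same input are fused by element-wise multiplication. The function names, vector sizes, and weights are hypothetical and are not taken from MF-YOLO or its C2f-CAST module.

```python
# Minimal sketch of a "star operation" under the assumption stated above:
# two linear branches of the same input, fused by element-wise product.
# All names, shapes and weights are illustrative, not from MF-YOLO.

def linear(weights, bias, x):
    """Return W @ x + b, with the weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def star(x, w1, b1, w2, b2):
    """Element-wise product of two linear branches applied to x."""
    branch_a = linear(w1, b1, x)
    branch_b = linear(w2, b2, x)
    return [a * b for a, b in zip(branch_a, branch_b)]

x = [1.0, 2.0]
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]   # identity branch
w2, b2 = [[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]  # mixing branch
y = star(x, w1, b1, w2, b2)  # [3.0, -2.0]
```

With these toy weights the first output equals x1 * (x1 + x2), so each output is a sum of pairwise products of input components. This is the kind of cross-feature interaction that a multiplicative fusion provides and a plain sum of the two branches does not.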

roadside images; star operation; feature fusion; object detection; attention mechanism

Gu Yanghai (顾杨海), Li Fu (李富), Chen Deji (陈德基), Wang Quan (王泉)


School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044

Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, Shanghai 201804

School of Internet of Things Engineering, Wuxi University, Wuxi 214105


2024

Electronic Measurement Technology
Beijing Radio Technology Research Institute (北京无线电技术研究所)


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.166
ISSN:1002-7300
Year, volume (issue): 2024, 47(23)