
Research and implementation of dynamic multi-scale target detection model for UAV surveillance

In UAV reconnaissance, security monitoring, and autonomous driving, target detection faces significant challenges: targets in images often span multiple scales, small targets are particularly hard to detect, and targets are frequently occluded to varying degrees. To address these problems, this paper proposes a dynamic multi-scale target detection model, YOLO-DDE. First, the CEMA and CED convolutional modules are proposed to strengthen the backbone network's ability to process multi-scale information and extract fine-grained features, enabling more accurate recognition in complex scenes. Second, the FPAN network structure is restructured into the DFPN structure, which uses longitudinal cross-scale fusion to significantly improve the model's multi-scale feature fusion. Finally, a dynamic detection head is introduced in the form of the DD-Head structure, which strengthens the model's handling of downstream tasks. With its dynamic multi-scale structure, YOLO-DDE offers a new option for improving target detection performance. Ablation and comparison experiments were conducted on the PASCAL VOC dataset. Compared with YOLOv8, a current mainstream state-of-the-art model, YOLO-DDE improves the evaluation metrics mAP50 and mAP50-95 by 1.8% and 3.2%, respectively. Generalization experiments on the VisDrone, HIT-UAV, and FAIR1M2.0 datasets further indicate that the model generalizes well.
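The abstract reports its gains in mAP50 and mAP50-95 without defining them. As a minimal sketch, assuming the widely used COCO-style convention (a prediction matches a ground-truth box when their IoU meets a threshold; mAP50 uses the single threshold 0.50, and mAP50-95 averages over thresholds 0.50 to 0.95 in steps of 0.05), the snippet below shows the matching criterion only. The function names are hypothetical, and the paper's actual evaluation code is not given in the source.

```python
from typing import List, Tuple

# Axis-aligned box as (x1, y1, x2, y2); this layout is an assumption for illustration.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_thresholds() -> List[float]:
    """The ten thresholds 0.50, 0.55, ..., 0.95 assumed for mAP50-95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred: Box, gt: Box) -> int:
    """Count the thresholds at which `pred` would count as a match for `gt`."""
    overlap = iou(pred, gt)
    return sum(1 for t in iou_thresholds() if overlap >= t)


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (0, 0, 9, 8)  # slightly too small a box
    print(iou(pred, gt), thresholds_passed(pred, gt))
```

Under this convention a loosely localized box can still count at the 0.50 threshold while failing the stricter ones, which is why mAP50-95 is the more localization-sensitive of the two metrics.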

attention mechanism; multi-scale; decoupled head; deformable convolution; DFPN

Zhang Yu (张宇), Wang Yanji (王延吉), Ma Hui (马辉), Yan Kai (闫锴), Li Dazhou (李大舟)


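School of Computer Science and Technology, Shenyang University of Chemical Technology, Shenyang 110142

Key Laboratory of Intelligent Technology for Chemical Process Industry of Liaoning Province, Shenyang 110142

Network and Information Center, Shenyang University of Chemical Technology, Shenyang 110142

Department of Information and Control Engineering, Shenyang Institute of Science and Technology, Shenyang 110167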



Scientific Research Project of the Education Department of Liaoning Province

LJKZ0449

2024

Electronic Measurement Technology (电子测量技术)
Beijing Radio Technology Research Institute (北京无线电技术研究所)


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 1.166
ISSN: 1002-7300
Year, volume (issue): 2024, 47(10)