Road Damage Detection in Large UAV Images Using a Multiscale Fusion Strategy and Improved YOLOv5

Original Chinese title: 应用多尺度融合策略和改进YOLOv5的道路病害无人机检测

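Abstract (translated from the Chinese): Combining UAVs with deep-learning object detection algorithms to detect road damage automatically offers wide coverage and high cost-effectiveness. However, road damage varies sharply in shape and size, which makes it difficult to detect each instance completely. In addition, because of limited computing resources, general-purpose object detection algorithms are suited only to small images (512 × 512 or 640 × 640 pixels) and are difficult to apply directly to large UAV images (5472 × 3648 or 7952 × 5304 pixels). Detecting multi-scale targets in large images with traditional methods leads to problems such as large targets being split across slices and small targets being missed. To address these problems, this paper proposes an innovative solution that combines a global-local multiscale fusion strategy with YOLOv5-RDD. (1) A YOLOv5-RDD model is built: on the basis of the existing YOLOv5 model, a multiscale C3 (MSC3) module and a Contextual Feature Pyramid Network (CFPN) are designed to strengthen the detection of multi-scale targets. (2) A global-local multiscale fusion strategy is proposed: downsampling and slicing are used to obtain the global and local information of a large UAV image, the global and local multiscale information is then superimposed to obtain the multiscale information of the whole image, and a center non-maximum suppression algorithm is used to refine the detection results. (3) To verify the effectiveness of the proposed method, a UAV-RDD dataset dedicated to UAV road damage detection is created. Experimental results show that YOLOv5-RDD improves mAP by 5.8% over the original YOLOv5 model, and that the global-local multiscale fusion strategy improves mAP by 9.73% over the traditional method, which demonstrates the effectiveness and superiority of the proposed method.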
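The slicing step of the global-local strategy can be illustrated with a small sketch. The abstract does not give the tile size, the overlap, or the downsampling factor, so the 640-pixel tile, the 128-pixel overlap, and the helper names below (`tile_origins`, `local_to_global`, `downsampled_to_global`) are assumptions chosen for illustration, not the authors' implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of overlapping tiles that cover [0, length).

    Assumes 0 <= overlap < tile. The last tile is placed flush with the
    image edge so that no pixels are left uncovered.
    """
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)
    return origins


def local_to_global(box: Box, origin: Tuple[int, int]) -> Box:
    """Shift a box predicted on one tile into full-image coordinates."""
    ox, oy = origin
    x1, y1, x2, y2 = box
    return (x1 + ox, y1 + oy, x2 + ox, y2 + oy)


def downsampled_to_global(box: Box, scale: float) -> Box:
    """Rescale a box predicted on the downsampled (global) image.

    `scale` is original size divided by downsampled size (hypothetical name).
    """
    x1, y1, x2, y2 = box
    return (x1 * scale, y1 * scale, x2 * scale, y2 * scale)


# Tile grid for a 5472 x 3648 image, one of the sizes named in the abstract.
xs = tile_origins(5472, tile=640, overlap=128)
ys = tile_origins(3648, tile=640, overlap=128)
tiles = [(x, y) for y in ys for x in xs]
```

Boxes from every tile (local view) and from the downsampled whole image (global view) end up in the same coordinate frame, which is what allows the two sets of detections to be pooled before duplicate suppression.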

English abstract: The use of Unmanned Aerial Vehicles (UAVs) for road image collection is advantageous owing to their large scope and cost-effectiveness. However, the size and shape of road damage vary significantly, making it challenging to detect completely. Furthermore, due to limited computational resources, general-purpose object detection algorithms are only applicable to small images (512 × 512 pixels or 640 × 640 pixels), which makes them unsuitable for direct application to large UAV images (5472 × 3648 pixels or 7952 × 5304 pixels). Using traditional methods to detect multi-scale targets in large images causes a number of problems, including large targets being split across slices and small targets being missed. To address these challenges, this paper presents an innovative solution that combines a global-local multiscale fusion strategy with YOLOv5-RDD. First, a YOLOv5-RDD model is constructed: based on the existing YOLOv5 model, a multiscale C3 (MSC3) module and a Contextual Feature Pyramid Network (CFPN) are designed to improve the detection of multi-scale targets. Additionally, we introduce an extra detection head for larger targets. Then, a global-local multiscale fusion strategy is proposed, which uses resizing and slicing to obtain the global and local information of large UAV images and superimposes the global and local multiscale information to obtain the multiscale information of the whole image. The detection results are refined using the center non-maximum suppression algorithm. Specifically, the global-local multiscale fusion strategy first trains YOLOv5-RDD with a multiscale training strategy so that it learns complete multiscale features. Then, YOLOv5-RDD predicts multi-scale road damage in large images using a multiscale prediction strategy, which avoids applying the model directly to these images. Finally, we use center non-maximum suppression to eliminate redundant detection boxes. To verify the effectiveness of the proposed method and meet real-world requirements, a UAV-RDD dataset dedicated to UAV road damage detection is created. The experimental results show that, compared with the original YOLOv5 model, YOLOv5-RDD improves mAP by 5.8%, while the global-local multiscale fusion strategy improves mAP by 9.73% compared with the traditional method. The MSC3 module achieves the largest gain in mAP@0.5, an improvement of 2.6%, while adding only 0.8 M parameters. The CFPN yields an improvement of 0.2% in mAP@0.5 while reducing the number of parameters by 8 M. These results demonstrate the effectiveness and superiority of the proposed method.
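The abstract names center non-maximum suppression but does not define it. The sketch below assumes one plausible reading: detections are visited in descending score order, and a box is dropped when its center lies inside an already kept box of the same class. The `Detection` type, its field names, and this suppression rule are illustrative assumptions and may differ from the authors' algorithm.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    x1: float
    y1: float
    x2: float
    y2: float
    score: float
    label: str


def center_inside(inner: Detection, outer: Detection) -> bool:
    """True if the center point of `inner` falls within `outer`."""
    cx = (inner.x1 + inner.x2) / 2
    cy = (inner.y1 + inner.y2) / 2
    return outer.x1 <= cx <= outer.x2 and outer.y1 <= cy <= outer.y2


def center_nms(dets: List[Detection]) -> List[Detection]:
    """Assumed center-based suppression (not confirmed by the source).

    Keeps the highest-scoring boxes and drops any lower-scoring box of the
    same class whose center lies inside a box that was already kept.
    """
    kept: List[Detection] = []
    for det in sorted(dets, key=lambda d: d.score, reverse=True):
        if any(k.label == det.label and center_inside(det, k) for k in kept):
            continue
        kept.append(det)
    return kept


# Hypothetical detections pooled from the global and local passes.
pooled = [
    Detection(0, 0, 100, 100, 0.9, "crack"),
    Detection(40, 40, 120, 120, 0.6, "crack"),      # center (80, 80) is inside the first box
    Detection(200, 200, 260, 260, 0.5, "crack"),
    Detection(40, 40, 120, 120, 0.7, "pothole"),    # different class, so it is kept
]
result = center_nms(pooled)
```

Under this assumed rule, a fragment of a long crack detected on a single tile is removed when the downsampled global pass has already produced a higher-scoring box that contains the fragment's center, a case where an IoU threshold alone may fail because the overlap ratio between a small fragment and a large box is low.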

English keywords:

road damage detection; YOLOv5; Unmanned Aerial Vehicle (UAV); object detection; large-size image; multi-scale feature fusion; non-maximum suppression

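Authors:

Cheng Chuanxiang (程传祥), Jin Fei (金飞), Lin Yuzhun (林雨准), Wang Shuxiang (王淑香), Zuo Xibing (左溪冰), Li Junjie (李军杰), Su Kaiyang (苏凯阳)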

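Affiliations:

Information Engineering University, Zhengzhou 450001

Henan University of Urban Construction, Pingdingshan 467036

Pingdingshan University, Pingdingshan 467000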

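Keywords (translated from the Chinese):

road damage detection; YOLOv5; UAV imagery; object detection; large-size image; multi-scale feature fusion; non-maximum suppression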

Publication year:

2024

DOI:

10.12082/dqxxkx.2024.240147

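Journal of Geo-information Science (地球信息科学学报)

Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences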

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.004

ISSN: 1560-8999

Year, volume (issue): 2024, 26(8)