Road Damage Detection in Large UAV Images Using a Multiscale Fusion Strategy and Improved YOLOv5
The use of Unmanned Aerial Vehicles (UAVs) for road image collection is advantageous owing to their wide coverage and cost-effectiveness. However, the size and shape of road damage vary significantly, making it challenging to detect. Furthermore, due to limited computational resources, general-purpose object detection algorithms are typically applied only to small images (512 × 512 pixels or 640 × 640 pixels), which makes them unsuitable for direct application to large UAV images (5472 × 3648 pixels or 7952 × 5304 pixels). Traditional methods for detecting multiscale targets in large images suffer from several issues, including the splitting of large targets across slices and the failure to detect small targets. To address these challenges, this paper presents a solution that combines a global-local multiscale fusion strategy with YOLOv5-RDD. First, a YOLOv5-RDD model is constructed: building on the existing YOLOv5 model, a multiscale C3 (MSC3) module and a Contextual Feature Pyramid Network (CFPN) are designed to improve the detection of multiscale targets, and an extra detection head is introduced for larger targets. Then, a global-local multiscale fusion strategy is proposed, which uses resizing and slicing to obtain the global and local information of a large UAV image and then superimposes the two to obtain multiscale information for the whole image. The detection results are refined using the center non-maximum suppression algorithm. Specifically, the global-local multiscale fusion strategy first trains YOLOv5-RDD with a multiscale training strategy so that it learns complete multiscale features. YOLOv5-RDD then predicts multiscale road damage in large images using a multiscale prediction strategy, which avoids applying the model directly to the full-resolution image.
Finally, we use center non-maximum suppression to eliminate redundant detection boxes. To verify the effectiveness of the proposed method and meet real-world requirements, a UAV-RDD dataset dedicated to UAV road damage detection is created. Experimental results show that, compared with the original YOLOv5 model, YOLOv5-RDD improves mAP by 5.8%, while the global-local multiscale fusion strategy improves mAP by 9.73% compared with the traditional method. MSC3 yields the largest improvement in mAP@0.5, 2.6%, while adding only 0.8 M parameters. The CFPN improves mAP@0.5 by 0.2% while reducing the number of parameters by 8 M. These results demonstrate the effectiveness of the proposed method.
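
The local (slicing) branch and the final suppression step can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 64-pixel overlap, the helper names (`tile_origins`, `shift_box`, `center_suppress`), and the suppression rule itself are assumptions made for illustration: the abstract does not define center non-maximum suppression, so the sketch uses one plausible reading in which a box is dropped when its center lies inside a higher-scoring box that has already been kept.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score); class labels are ignored in this sketch.
Box = Tuple[float, float, float, float, float]


def tile_origins(length: int, tile: int, overlap: int) -> List[int]:
    """Start offsets of tiles that cover [0, length) along one axis."""
    assert 0 <= overlap < tile, "overlap must be smaller than the tile size"
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the image edge
    return origins


def shift_box(box: Box, x0: int, y0: int) -> Box:
    """Map a box predicted inside a tile back to full-image coordinates."""
    x1, y1, x2, y2, score = box
    return (x1 + x0, y1 + y0, x2 + x0, y2 + y0, score)


def center_suppress(boxes: List[Box]) -> List[Box]:
    """Assumed center-based rule: keep boxes in descending score order and
    drop any box whose center falls inside an already-kept box."""
    kept: List[Box] = []
    for b in sorted(boxes, key=lambda b: b[4], reverse=True):
        cx, cy = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
        inside = any(k[0] <= cx <= k[2] and k[1] <= cy <= k[3] for k in kept)
        if not inside:
            kept.append(b)
    return kept


if __name__ == "__main__":
    xs = tile_origins(5472, 640, 64)
    ys = tile_origins(3648, 640, 64)
    print(len(xs) * len(ys), "tiles")
    merged = center_suppress([
        (0, 0, 100, 100, 0.9),
        shift_box((10, 10, 90, 90, 0.8), 0, 0),
        (200, 200, 300, 300, 0.7),
    ])
    print(merged)
```

Under these assumptions, a 5472 × 3648 image cut into 640 × 640 tiles with a 64-pixel overlap gives 10 × 7 = 70 tiles. Boxes predicted on the resized global image would be rescaled to full-image coordinates and pooled with the shifted tile boxes before suppression.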