Optimized and Improved YOLOv8 Target Detection Algorithm from UAV Perspective
To address the multi-scale targets, small targets, occlusion, and complex backgrounds encountered in unmanned aerial vehicle (UAV) imagery, this paper proposes BDAD-YOLO, an improved YOLOv8 algorithm based on a dynamic-sample attention scale sequence. First, the backbone of the original model is restructured using ideas from BiFormer, which increases the model's attention to key information and better preserves fine-grained target details. Because targets vary widely in size and position, standard convolution handles them poorly; a C2_DCf module based on the deformable convolutional network (DCN) is therefore designed to strengthen small-target feature extraction and improve feature fusion at the small-target layer of the neck network. Second, an attention-scale sequence fusion framework based on dynamic samples (AFD) is proposed; it uses lightweight dynamic point sampling and fuses feature maps of different scales to enhance the network's ability to extract multi-scale information. Finally, the WIoU loss function is adopted to reduce the adverse effect of low-quality small-target samples on the gradient, which speeds up network convergence. Experimental results show that mean average precision (mAP@0.5) improves by 4.6 and 3.7 percentage points on the VisDrone val and test sets, respectively, and by 2.4 percentage points on the DOTA dataset, demonstrating the effectiveness and generality of the improved algorithm.
target detection; unmanned aerial vehicle perspective; YOLOv8; BiFormer; feature fusion; loss function
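To make the role of the WIoU loss more concrete, the following is a minimal sketch of the Wise-IoU v1 formulation for a single pair of axis-aligned boxes. The abstract does not state which WIoU version BDAD-YOLO uses, so the choice of v1 is an assumption, and the function names `iou` and `wiou_v1` are hypothetical. In v1 the IoU loss is scaled by a distance term computed from the box centres and the smallest enclosing box; later versions add a dynamic focusing coefficient that down-weights low-quality samples, which is not shown here.

```python
import math


def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, target):
    """Wise-IoU v1 sketch (assumed variant): R_WIoU * (1 - IoU)."""
    l_iou = 1.0 - iou(pred, target)

    # Centre points of the predicted and ground-truth boxes.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_t, cy_t = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes.
    # In a training framework this denominator is detached from the
    # gradient; plain Python has no gradient, so nothing is needed here.
    w_g = max(pred[2], target[2]) - min(pred[0], target[0])
    h_g = max(pred[3], target[3]) - min(pred[1], target[1])
    denom = w_g ** 2 + h_g ** 2

    dist_sq = (cx_p - cx_t) ** 2 + (cy_p - cy_t) ** 2
    r_wiou = math.exp(dist_sq / denom) if denom > 0 else 1.0
    return r_wiou * l_iou


if __name__ == "__main__":
    print(wiou_v1((0, 0, 2, 2), (1, 1, 3, 3)))
```

Because the distance term is at least 1, the loss is never smaller than the plain IoU loss, and it grows as the predicted centre moves away from the ground-truth centre.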