A multiscale target detection network, VTO-YOLOv8, is proposed for unmanned aerial vehicle (UAV) images to address the low accuracy of existing algorithms caused by complex backgrounds, a high proportion of small targets, and uneven target distributions. First, wise intersection over union (WIoU) v3 was used as the bounding-box regression loss, and its wise gradient allocation strategy was employed so that the network focuses more on ordinary-quality samples, improving localization ability. Second, a four-layer target bi-directional feature pyramid network (T-BiFPN) structure was designed to strengthen the integration of shallow and deep features. Furthermore, a faster implementation of CSP bottleneck with diverse branch blocks (C2f-DBB) module was designed to improve the detection performance of the network without increasing computational complexity. In addition, a focal modulation module was used to enhance the interaction of information at different scales. The experimental results demonstrated that the proposed network improved the mean average precision (mAP) and mAP50 by 5.9% and 9.0%, respectively, compared with the baseline network on the VisDrone2019 dataset. Moreover, the number of network parameters was reduced by 22.6%. The proposed method can be applied to target detection in UAV aerial photography.
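The gradient allocation behind WIoU v3 can be illustrated with a minimal sketch. The abstract does not state the formula, so the code below assumes the commonly published Wise-IoU v3 formulation: an IoU loss scaled by a center-distance term and by a non-monotonic focusing gain r = β / (δ · α^(β − δ)), where β is the ratio of a sample's IoU loss to the running mean IoU loss. The function names, the `(x1, y1, x2, y2)` box layout, and the values α = 1.9 and δ = 3 are assumptions for illustration and are not taken from this paper. The sketch uses plain Python for single boxes and omits the gradient detachment that a training implementation would apply to the gain and to the enclosing-box term.

```python
import math


def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain; alpha and delta are assumed hyperparameters."""
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3_loss(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """Sketch of a WIoU v3 style loss for one predicted/target box pair.

    mean_iou_loss stands in for the running mean of the IoU loss, which a
    training loop would maintain across batches.
    """
    l_iou = 1.0 - iou(pred, target)

    # Center distance, normalized by the smallest enclosing box.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])
    r_dist = math.exp(((pcx - tcx) ** 2 + (pcy - tcy) ** 2) / (wg ** 2 + hg ** 2))

    # Outlier degree: how this sample's loss compares with the average.
    beta = l_iou / mean_iou_loss
    return focusing_gain(beta, alpha, delta) * r_dist * l_iou
```

Under these assumptions the gain peaks at a moderate outlier degree, so both very well localized boxes (small β) and very poorly localized ones (large β) receive a smaller weight than ordinary-quality samples, which is the behavior the abstract attributes to the wise gradient allocation strategy.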