Target detection is the task of identifying the categories of objects in an image and locating them. One-stage target detection algorithms obtain classification and localization information from the feature maps output by a deep network. However, deep features lack spatial information because they pass through many successive convolution and down-sampling operations. To address this problem, this work borrows an idea from semantic segmentation: the one-stage detector YOLOv5 is improved by fusing the shallow features of the backbone network with the deep features obtained by up-sampling. ResNet-50 is used as the backbone network to extract feature-map information effectively. An attention mechanism is introduced into the residual blocks so that object information is selected during shallow feature extraction and more weight can be assigned to small and weak objects, which strengthens the feature representation and makes small-object detection more accurate. In addition, the generation method and the loss function of the preselected boxes are improved according to the characteristics of pedestrian detection data. Experimental results on the INRIA and Caltech data sets show that the improved model achieves better detection performance and speed.
Key words
deep learning/YOLOv5/image fusion/target detection/pedestrian target
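
The fusion idea described in the abstract (combining shallow backbone features with up-sampled deep features, with attention applied on the shallow side) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `upsample_nearest`, `channel_gate`, and `fuse` are hypothetical, nearest-neighbour up-sampling and channel concatenation are assumed as the fusion operations, and the parameter-free sigmoid gate is only a stand-in for a learned attention module, whose exact form the abstract does not specify.

```python
import math


def upsample_nearest(fmap, factor=2):
    """Nearest-neighbour up-sampling of a feature map stored as [C][H][W] lists."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            # Repeat every value `factor` times along the width ...
            wide = [v for v in row for _ in range(factor)]
            # ... and repeat every widened row `factor` times along the height.
            rows.extend([list(wide) for _ in range(factor)])
        out.append(rows)
    return out


def channel_gate(fmap):
    """Assumed stand-in for attention: scale each channel by sigmoid(channel mean).

    A real attention module would learn its weights; this gate has none.
    """
    gated = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        weight = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[v * weight for v in row] for row in channel])
    return gated


def fuse(shallow, deep, factor=2):
    """Concatenate gated shallow features with up-sampled deep features by channel."""
    up = upsample_nearest(deep, factor)
    same_height = len(up[0]) == len(shallow[0])
    same_width = len(up[0][0]) == len(shallow[0][0])
    if not (same_height and same_width):
        raise ValueError("spatial sizes do not match after up-sampling")
    # List concatenation over the outer axis is channel concatenation here.
    return channel_gate(shallow) + up


# Toy inputs: one shallow channel at 4x4, two deep channels at 2x2.
shallow = [[[0.0] * 4 for _ in range(4)]]
deep = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[5.0, 6.0], [7.0, 8.0]],
]
fused = fuse(shallow, deep)
print(len(fused), len(fused[0]), len(fused[0][0]))
```

The fused map keeps the higher spatial resolution of the shallow features while carrying the channels of the deep features, which is the property the abstract relies on for small-object detection. In a real detector the concatenation would normally be followed by a learned convolution, which this sketch omits.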