Pedestrian Detection Algorithms Incorporating Contextual Information and Attention Mechanisms
To address incomplete feature extraction and low detection accuracy in complex traffic scenarios, this paper proposes YOLOv5-STRDC, a pedestrian detection algorithm that improves the YOLOv5 network by fusing contextual information with attention mechanisms. First, a Swin Transformer is placed in the backbone to enrich contextual information while efficiently capturing global information. Second, a Spatial Pyramid Pooling (SPP) module is proposed that fuses five parallel dilated convolutions with an improved Convolutional Block Attention Module (CBAM); it captures information from a larger image region while enhancing feature fusion along the channel and spatial dimensions, respectively. Finally, a Coordinate Attention (CA) module is integrated to highlight important local regions so that more accurate feature information is extracted. YOLOv5-STRDC achieves a mean Average Precision (mAP) of 71.60% and 92.01% on the publicly available WiderPerson and INRIA datasets, respectively, an improvement of 1.80% and 1.34% over the YOLOv5 model. The detection speed of the proposed algorithm reaches 137.34 and 114.71 frames/s on the two datasets, respectively, which meets the requirement for real-time detection.
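
The abstract does not state the kernel size or dilation rates of the five parallel dilated (atrous) convolutions in the SPP module. The following is therefore only a minimal one-dimensional Python sketch under assumed values (a 3-tap kernel and rates 1 to 5; `ASSUMED_DILATIONS` and all function names are hypothetical, not taken from the paper). It illustrates why parallel dilated branches can cover a larger range of the input with the same number of weights: a kernel of size k with dilation d spans k + (k - 1)(d - 1) positions.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """1-D dilated cross-correlation with zero padding that preserves the input length."""
    k = len(kernel)
    assert k % 2 == 1, "this sketch only handles odd kernel sizes"
    pad = dilation * (k - 1) // 2  # 'same' padding for an odd kernel
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


# Assumed dilation rates for the five parallel branches (not given in the abstract).
ASSUMED_DILATIONS = (1, 2, 3, 4, 5)


def parallel_dilated_branches(signal, kernel, dilations=ASSUMED_DILATIONS):
    """Apply one branch per dilation rate and return the list of branch outputs."""
    return [dilated_conv1d(signal, kernel, d) for d in dilations]


if __name__ == "__main__":
    # A unit impulse makes each branch's footprint visible in its output.
    impulse = [0.0] * 5 + [1.0] + [0.0] * 5
    branches = parallel_dilated_branches(impulse, [1.0, 1.0, 1.0])
    for d, out in zip(ASSUMED_DILATIONS, branches):
        touched = [i for i, v in enumerate(out) if v != 0.0]
        print(f"dilation={d} span={effective_kernel_size(3, d)} responds at {touched}")
```

In a real detector the branches would be 2-D convolutions over feature maps, and their outputs would be fused (for example by concatenation) before the attention module; how YOLOv5-STRDC does this is not specified in the abstract.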