Small-Scale Object Detection Algorithm Integrating Attention and Feature Pyramids
A modified Faster R-CNN algorithm that combines the Convolutional Block Attention Module (CBAM) with a feature pyramid structure is proposed to address the poor detection of small-scale objects and of occluded or truncated objects. To make efficient use of local information in the feature maps, CBAM is integrated into the feature extraction network, which reduces interference from invalid targets and improves detection of occluded or truncated objects. A Feature Pyramid Network (FPN) structure is introduced to connect high- and low-level features, yielding feature maps with both high resolution and strong semantics and thereby improving the detection of small objects. To alleviate vanishing gradients and reduce the scale of hyperparameters, the commonly used VGG16 network is replaced with VS-ResNet, an inverted residual network with strong expressive ability. VS-ResNet modifies some of the hierarchical structure of the original ResNet-50, adds auxiliary classifiers, and applies inverted residual blocks and group convolution, so that information passing through the activation function is fully preserved in high-dimensional space and detection accuracy improves. A method that recalculates candidate box scores is used to compensate for the tendency of the Non-Maximum Suppression (NMS) algorithm to mistakenly eliminate overlapping detection boxes. Experimental results show that VS-ResNet improves accuracy on the CIFAR-10 dataset by 2.97 percentage points over VGG16. The proposed algorithm reaches a mean Average Precision (mAP) of 76.2% on the Pascal VOC 2012 dataset, which is 13.9 percentage points higher than that of the original Faster R-CNN algorithm.
deep learning; attention mechanism; feature pyramid; small object detection; truncated object detection
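The abstract does not give the formula used to recalculate candidate box scores. As an illustration of the general idea only, the sketch below uses Gaussian score decay in the style of Soft-NMS: instead of discarding every box that overlaps the current best detection, it lowers that box's score in proportion to the overlap. This may differ from the rule used in the paper. The function names `iou` and `rescoring_nms` and the parameters `sigma` and `score_threshold` are hypothetical and chosen for this example.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def rescoring_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Assumed Soft-NMS-style rescoring, not necessarily the paper's rule.

    Returns (box_index, rescored_score) pairs in the order they are selected.
    """
    remaining = [(score, index) for index, score in enumerate(scores)]
    kept = []
    while remaining:
        # Select the candidate with the highest current score.
        best_score, best_index = max(remaining)
        remaining.remove((best_score, best_index))
        kept.append((best_index, best_score))

        # Decay the other scores by their overlap with the selected box,
        # then drop only those whose score has become negligible.
        decayed = []
        for score, index in remaining:
            overlap = iou(boxes[best_index], boxes[index])
            new_score = score * math.exp(-(overlap ** 2) / sigma)
            if new_score >= score_threshold:
                decayed.append((new_score, index))
        remaining = decayed
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for index, score in rescoring_nms(boxes, scores):
        print(index, round(score, 3))
```

In this toy input the second box overlaps the first with an IoU of about 0.68. Standard NMS with a 0.5 threshold would delete it outright, whereas the rescoring variant keeps it with a reduced score, which is the behavior that helps when two real objects overlap heavily.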