Pose Estimation Algorithm for Small-Target Pedestrians in Urban Street Scenes Based on YOLO-Pose

Ma Mingxu¹, Ma Hong², Song Huawei¹

Author information

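1. School of Cyberspace Security, Zhengzhou University, Zhengzhou 450000, Henan, China
2. Institute of Information Technology, Strategic Support Force Information Engineering University, Zhengzhou 450000, Henan, China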

Abstract

Existing pose estimation algorithms perform poorly on small-target pedestrians in urban street scenes. To address this problem, this study proposes YOLO-Pose-CBAM, a pose estimation algorithm for small-target pedestrians based on YOLO-Pose. First, a CBAM attention module is introduced to strengthen the network's focus on small-target pedestrian regions and to improve the algorithm's sensitivity to small-target pedestrians without adding excessive computation. At the same time, four detection heads of different sizes are used in the backbone network so that pedestrians of different sizes in an image can be detected. Second, two cross-layer cascading channels are built between the backbone and the neck, which improves feature fusion between the shallow and deep layers, further strengthens information exchange, and reduces the missed-detection rate for small-target pedestrians. Third, SIoU is introduced to redefine the localization loss for bounding-box regression, which speeds up training convergence and improves detection accuracy. Finally, the k-means++ algorithm replaces the k-means algorithm for clustering the annotated anchor boxes in the dataset, which avoids the local-optimum problem caused by cluster-center initialization and yields anchor boxes better suited to detecting small-target pedestrians. Comparative experiments show that, on the small-target pedestrian dataset WiderKeypoints, the Average Precision (AP) of the proposed algorithm is 4.6 percentage points higher than that of YOLO-Pose and 6.5 percentage points higher than that of YOLOv7-Pose.
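The anchor-clustering step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes annotated boxes are given as (width, height) pairs and uses 1 - IoU as the distance, a common choice in YOLO-family anchor clustering that the abstract does not confirm. The function names `iou_wh`, `kmeanspp_init`, and `cluster_anchors` are hypothetical, and the input is assumed to contain at least `k` distinct boxes.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes, assumed to share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_init(boxes, k, rng):
    """k-means++ seeding: sample each new center with probability proportional
    to the squared distance (here 1 - IoU, an assumption) to the nearest
    center chosen so far."""
    centers = [rng.choice(boxes)]
    while len(centers) < k:
        d2 = [min(1.0 - iou_wh(b, c) for c in centers) ** 2 for b in boxes]
        centers.append(rng.choices(boxes, weights=d2, k=1)[0])
    return centers


def cluster_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) boxes into k anchors, returned in order of increasing area."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        # Assignment step: each box joins the center it overlaps most.
        groups = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, centers[i]))
            groups[best].append(b)
        # Update step: mean width and height per group; an empty group
        # keeps its previous center.
        new_centers = []
        for g, c in zip(groups, centers):
            if g:
                mean_w = sum(w for w, _ in g) / len(g)
                mean_h = sum(h for _, h in g) / len(g)
                new_centers.append((mean_w, mean_h))
            else:
                new_centers.append(c)
        if new_centers == centers:
            break
        centers = new_centers
    return sorted(centers, key=lambda c: c[0] * c[1])
```

Compared with uniform random initialization, the seeding step tends to spread the initial centers across small, medium, and large boxes, which is the property the abstract relies on when it prefers k-means++ over plain k-means.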

Key words

YOLO-Pose algorithm/pose estimation/cross-layer cascading/CBAM attention mechanism/SIoU loss function/k-means++ algorithm

Funding

Major Science and Technology Project of Henan Province (221100210100)

Publication year

2024

Computer Engineering

East China Institute of Computing Technology; Shanghai Computer Society

CSTPCD; Peking University Core Journals

Impact factor: 0.581

ISSN: 1000-3428

Number of references: 32
