Computer Engineering & Science (计算机工程与科学), 2024, Vol. 46, Issue 5: 852-860. DOI: 10.3969/j.issn.1007-130X.2024.05.011

A multi-person pose estimation correction algorithm based on improved YOLOv5

Zhao Jinyuan 1, Jia Di 1

Author information

  • 1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125100, Liaoning, China

Abstract

Multi-person pose estimation in crowded scenes still suffers from problems such as small detection targets, which lower estimation accuracy. To address this, this paper proposes a multi-person pose estimation correction algorithm based on an improved YOLOv5. First, a jump attention module is integrated into the YOLOv5 backbone network to help the network find regions of interest in the image. Second, in the neck network, a weighted bidirectional feature pyramid is used to improve feature fusion across feature maps of different scales, and the jump attention module is combined with a Transformer encoder so that the network captures global information and rich contextual information. Third, an additional detection head is added to the detection part to make the network more sensitive to tiny objects. Finally, the keypoint object information predicted by the network is used to correct the pose object information, yielding the final multi-person pose estimation result. Experimental results show that, compared with YOLOv5, the proposed algorithm improves AP50 by 2.2% and AP75 by 3.3% on the COCO dataset, validating its accuracy and robustness.
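The abstract does not spell out how the weighted bidirectional feature pyramid combines feature maps. A common choice for this kind of weighted fusion is fast normalized fusion, where each input gets a learnable non-negative weight and the weights are normalized to sum to roughly one. The sketch below illustrates that general idea on plain Python lists; the function name `fast_normalized_fusion`, the `eps` value, and the use of flat lists instead of tensors are assumptions for illustration, not details taken from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,
) -> List[float]:
    """Fuse equally sized feature vectors with normalized non-negative weights.

    Hypothetical helper: a real network would apply this to resized
    feature maps (tensors) and learn raw_weights during training.
    """
    if not features or len(features) != len(raw_weights):
        raise ValueError("need one weight per feature vector")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature vectors must have the same length")

    # ReLU keeps every weight non-negative, so the normalization is stable.
    weights = [max(w, 0.0) for w in raw_weights]
    total = sum(weights) + eps  # eps avoids division by zero

    fused = [0.0] * length
    for w, feature in zip(weights, features):
        for i, value in enumerate(feature):
            fused[i] += (w / total) * value
    return fused


if __name__ == "__main__":
    shallow = [1.0, 2.0]  # stand-in for a fine-scale feature map
    deep = [3.0, 4.0]     # stand-in for a coarse-scale feature map
    print(fast_normalized_fusion([shallow, deep], [1.0, 1.0]))
```

With equal weights the result is close to the element-wise mean of the inputs; a weight driven negative during training is clipped to zero, so that input simply stops contributing.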

Key words

human pose estimation/jump attention mechanism/weighted feature pyramid/Transformer encoder/object detection

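Funding

National Natural Science Foundation of China (61601213)

Liaoning Provincial Department of Education Project (LJ2020FWL004)

Liaoning Provincial Department of Education Project (2019-ZD-0038)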

Publication year

2024
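Computer Engineering & Science (计算机工程与科学)
College of Computer Science, National University of Defense Technology

Indexed in: CSTPCD, Peking University Core Journals (北大核心)
Impact factor: 0.787
ISSN: 1007-130X
Number of references: 15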