Human posture estimation and movement recognition in fitness behavior

Fu Huichen (付惠琛)¹, Gao Junwei (高军伟)¹, Che Luyang (车鲁阳)¹

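Author information

1. School of Automation, Qingdao University, Qingdao, Shandong 266071; Shandong Key Laboratory of Industrial Control Technology, Qingdao, Shandong 266071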

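Abstract (translated from the Chinese)

Human pose estimation and action recognition have important applications in security, medical care, and sports. To address human pose estimation and action recognition for various exercise movements against different backgrounds and from different viewing angles, this paper proposes an improved YOLOv7-POSE algorithm and trains it on a self-recorded dataset covering a range of shooting angles. The algorithm builds on YOLOv7 and adds a classification function to the original network model. The CA convolutional attention mechanism is introduced into the Backbone network, improving the network's ability to recognize the features that matter for human skeletal keypoints and action classification. The HorNet network structure replaces the CBS convolution kernel of the original model, improving the accuracy of human keypoint detection and of action classification. The spatial pyramid pooling structure in the Head layer is replaced with an atrous spatial pyramid pooling structure, which improves detection accuracy and speeds up model convergence. The regression function for the object detection box is changed from CIOU to EIOU, improving the accuracy of coordinate regression. Two groups of controlled experiments were designed. The results show that the improved YOLOv7-POSE achieves an mAP of 95.7% on the validation set, 4% higher than the original YOLOv7 algorithm; recognition accuracy for all exercise movements rises significantly, false and missed keypoint detections in actual inference are markedly reduced, and keypoint position estimation error is markedly lower.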

Abstract

Human pose estimation and action recognition have important application value in security, medical treatment, and sports. To solve the problem of human pose estimation and action recognition for various movements against complex backgrounds, an improved YOLOv7-POSE algorithm is proposed, and a self-made dataset covering various shooting angles is used for training. Based on YOLOv7, the algorithm adds a classification function to the original network model. The CA convolutional attention mechanism is introduced into the Backbone network, which improves the recognition of features important for classifying human skeletal keypoints and actions. The CBS convolution kernel of the original model is replaced by the HorNet network structure, which improves the detection accuracy of human keypoints and the accuracy of action classification. The spatial pyramid pooling structure of the Head layer is replaced by an atrous spatial pyramid pooling structure, which improves precision and speeds up model convergence. The regression function of the object detection box is changed from CIOU to EIOU, which improves the precision of coordinate regression. A dataset of fitness movements against complex backgrounds and from various shooting angles is self-recorded, and comparison experiments are carried out on it. Experimental results show that the mAP of the improved YOLOv7-POSE on the test set is 95.7%, 4% higher than that of the original YOLOv7 algorithm. The recognition accuracy for all kinds of movements increases significantly, and false and missed keypoint detections decrease significantly.
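The abstract names the CIOU-to-EIOU replacement but gives no formulas. As a rough illustration only, the sketch below uses the commonly published definitions of the two losses for axis-aligned boxes in (x1, y1, x2, y2) form. The function names and the example boxes are assumptions made here, not the authors' code or data.

```python
import math


def _box_terms(box, gt):
    """Shared terms: IoU, squared center distance, enclosing-box width and height."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = (x2 - x1) * (y2 - y1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / union
    # Squared distance between the two box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2
    # Width and height of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    return iou, rho2, cw, ch


def ciou_loss(box, gt):
    """CIoU: IoU term + center distance + aspect-ratio consistency term."""
    iou, rho2, cw, ch = _box_terms(box, gt)
    w, h = box[2] - box[0], box[3] - box[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / (1 - iou + v) if v > 0 else 0.0  # guard the 0/0 case
    return 1 - iou + rho2 / (cw ** 2 + ch ** 2) + alpha * v


def eiou_loss(box, gt):
    """EIoU: IoU term + center distance + separate width and height penalties."""
    iou, rho2, cw, ch = _box_terms(box, gt)
    w, h = box[2] - box[0], box[3] - box[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    return (1 - iou
            + rho2 / (cw ** 2 + ch ** 2)
            + (w - gw) ** 2 / cw ** 2
            + (h - gh) ** 2 / ch ** 2)


# Hypothetical boxes with the same aspect ratio but different sizes.
pred, target = (0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 4.0, 4.0)
print(ciou_loss(pred, target), eiou_loss(pred, target))
```

In this example the two boxes share an aspect ratio, so the CIoU aspect-ratio term vanishes, while EIoU still penalizes the width and height mismatch directly. That direct penalty on side lengths is the usual motivation given for EIoU in coordinate regression.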

Key words

image processing/key point detection/pose estimation/convolutional attention mechanism/atrous spatial pyramid pooling

Funding

Natural Science Foundation of Shandong Province (ZR2019MF063)

Publication year

2024

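Chinese Journal of Liquid Crystals and Displays (液晶与显示)

Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufactures Association; Liquid Crystal Branch of the Chinese Physical Society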

Indexed in: CSTPCD, Peking University Core Journals

Impact factor: 0.964

ISSN: 1007-2780

Number of references: 4
