Chinese Journal of Scientific Instrument (仪器仪表学报), 2024, Vol. 45, Issue 4: 136-144. DOI: 10.19650/j.cnki.cjsi.J2312272

Transformer-based 3D human pose estimation and action achievement evaluation

Yang Aolei (杨傲雷) 1, Zhou Yinghong (周应宏) 2, Yang Banghua (杨帮华) 2, Xu Yulin (徐昱琳) 2

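Author information

  • 1. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444; Shanghai Key Laboratory of Power Station Automation Technology, Shanghai 200444
  • 2. School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444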

Abstract

To address the problem of human pose analysis and evaluation in fields such as human-computer interaction and medical rehabilitation, this paper proposes a Transformer-based method for 3D human pose estimation and action achievement evaluation. First, the key points and joint angles of the human pose are defined and, building on the deep pose estimation network (DPEN), a Transformer-based 3D human pose estimation model (TPEM) is proposed and constructed; introducing the Transformer allows the long-term temporal features of the human pose to be extracted more effectively. Second, using the 3D pose estimates produced by TPEM, a dynamic time warping algorithm based on weighted 3D joint angles is designed to temporally align the pose keyframes of different people performing the same action, and on this basis an action achievement evaluation method is proposed that outputs an achievement score for the action. Finally, experiments on different datasets show that TPEM achieves an average joint position error of 37.3 mm on the Human3.6M dataset, while the dynamic time warping algorithm based on weighted 3D joint angles yields an average error of 5.08 frames on the Fit3D dataset, demonstrating the feasibility and effectiveness of the proposed method for 3D human pose estimation and action achievement evaluation.
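
The alignment step described in the abstract can be illustrated with a minimal sketch of dynamic time warping over weighted joint angles. This is not the authors' implementation: the function names `joint_angle` and `weighted_angle_dtw`, the weighted absolute-difference frame cost, and the example weights are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle (radians) at joint b formed by the 3D key points a-b-c."""
    u, v = np.asarray(a, float) - b, np.asarray(c, float) - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12)
    return np.arccos(np.clip(cos, -1.0, 1.0))

def weighted_angle_dtw(ref, query, weights):
    """Align two joint-angle sequences with classic DTW.

    ref:     (T1, J) array of joint angles for the reference performer
    query:   (T2, J) array of joint angles for the evaluated performer
    weights: (J,) per-joint importance (hypothetical values, normalized here)
    Returns (total cost, warping path as a list of (ref_idx, query_idx)).
    """
    ref = np.asarray(ref, float)
    query = np.asarray(query, float)
    w = np.asarray(weights, float)
    w = w / w.sum()
    # Assumed frame cost: weighted mean absolute angle difference, shape (T1, T2).
    cost = np.abs(ref[:, None, :] - query[None, :, :]) @ w
    t1, t2 = cost.shape
    acc = np.full((t1 + 1, t2 + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, t1 + 1):
        for j in range(1, t2 + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first to recover the frame matching.
    i, j = t1, t2
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    return acc[t1, t2], path[::-1]

# Usage: the query is the reference with its first and last frames held longer.
ref = np.array([[0.0, 1.0], [0.5, 1.2], [1.0, 1.5]])
query = ref[[0, 0, 1, 2, 2]]
total, path = weighted_angle_dtw(ref, query, weights=[2.0, 1.0])
```

In this sketch the warping path pairs each reference frame with the query frames that show the same pose, which is the kind of keyframe matching an achievement score could then be computed on. How the paper weights the joints and turns the alignment into a score is not stated in the abstract.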

Key words

3D human pose estimation/deep learning/dynamic time warping/action evaluation

Funding

National Key Research and Development Program of China (2023YFF1203503)

Natural Science Foundation of Shanghai (22ZR1424200)

Publication year

2024
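Chinese Journal of Scientific Instrument (仪器仪表学报)
China Instrument and Control Society (中国仪器仪表学会)

Indexed in: CSTPCD, Peking University Core Journals
Impact factor: 2.372
ISSN: 0254-3087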