
Deep-Learning-Based 2D Human Pose Estimation: Present and Future

2D human pose estimation aims to identify and localize the body keypoints of every person in an image captured by a camera. As a fundamental task in human analysis and understanding, it supports many downstream tasks and real-world applications. In recent years, advances in deep learning have driven rapid progress in human pose estimation. Depending on the number of persons in the image, the task divides into single-person pose estimation and the more challenging multi-person pose estimation. This paper first introduces the research background, problem definition, main difficulties, and keypoint representations used by current methods. It then summarizes representative single-person and multi-person methods. Single-person methods comprise regression-based and detection-based approaches, and the discussion focuses on network architecture design, heatmap encoding and decoding, and multi-task learning. For multi-person pose estimation, the paper covers heatmap-based methods and vector-field-regression-based methods. It further summarizes the widely used datasets and evaluation metrics, reports the performance of representative methods on several common datasets, and analyzes and compares the scenarios in which these methods fail. Finally, the paper discusses the challenges that existing 2D human pose estimation algorithms have yet to solve and outlines directions for future research.

Keywords: single-person pose estimation; multi-person pose estimation; deep learning; top-down; bottom-up; regression

Authors: Li Jianing (李佳宁), Wang Dongkai (王东凯), Zhang Shiliang (张史梁)


School of Computer Science, Peking University, Beijing 100871

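Chinese keywords (translated): single-person pose estimation; multi-person pose estimation; deep learning; top-down; bottom-up; vector field regression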

Funding: National Natural Science Foundation of China (U20B2052, 61936011); National Key Research and Development Program of China (2018YFE0118400)

2024

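Journal: Chinese Journal of Computers (计算机学报)
Sponsored by: China Computer Federation; Institute of Computing Technology, Chinese Academy of Sciences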

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 3.18
ISSN: 0254-4164
Year, volume (issue): 2024, 47(1)