Deep-Learning-Based 2D Human Pose Estimation: Present and Future
2D human pose estimation aims to identify and localize the body keypoints of each person in an image. As a fundamental task in human analysis and understanding, it supports multiple downstream tasks and many real-world applications. In recent years, advances in deep learning have brought significant progress to human pose estimation. Based on the number of persons in the image, the task can be divided into single-person pose estimation and the more challenging multi-person pose estimation. This paper first introduces the research background, problem definition, task difficulties, and keypoint representations of human pose estimation. Next, we introduce representative single-person and multi-person pose estimation methods. The single-person section covers regression-based and detection-based methods, including the categories of network structure design, heatmap encoding/decoding, and multi-task learning. The multi-person section covers heatmap-based and regression-based methods. We further summarize the widely used datasets, benchmark metrics, and the performance of representative methods on these datasets. We also select representative methods from each category and analyze and compare their failure cases. Finally, we discuss the remaining challenges and promising research directions in human pose estimation.