Implicit Modeling Network of Human Keypoints Based on Attention Mechanism
Human pose estimation relies on visual cues and the anatomical relationships between joints to localize keypoints. Existing Convolutional Neural Network (CNN) methods struggle to capture long-range contextual cues and to model dependencies among distant joints. This paper introduces an attention-based implicit modeling method that iteratively computes feature correlations between joints, thereby implicitly modeling the constraint relationships among keypoints. Unlike the local operations characteristic of CNNs, the method enlarges the network's receptive field and models dependencies between distantly positioned joints. To compensate for the reduced visibility of some keypoints during network training, a focal loss function is adopted, prompting the network to concentrate on difficult keypoints. Comparative experiments were performed under identical conditions using the state-of-the-art High-Resolution Network (HRNet) and the classic Residual Network (ResNet) as backbone networks. The results show that the implicit modeling network improves human pose estimation performance. For instance, with HRNet as the backbone, accuracy on the MPII and MSCOCO human pose estimation benchmark datasets improved by 1.7% and 2.6%, respectively, over the original network.
human pose estimation; convolutional neural network; attention mechanism; focal loss; implicit modeling
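
The two mechanisms named in the abstract, iterative feature correlation between joints and a focal loss that emphasizes difficult keypoints, can be illustrated with a minimal sketch. It is not the network described in the paper. It assumes one feature vector per joint, uses the features themselves as queries, keys, and values (no learned projections), adds a residual connection at each iteration, and uses the commonly used binary form of focal loss with a hypothetical default gamma of 2.0. The function names joint_attention, refine, and focal_loss are introduced here for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(features):
    """One round of scaled dot-product self-attention over joint features.

    features: list of K vectors (one per joint), each of length D.
    Assumption: queries, keys, and values are the raw features.
    Returns the attended features and the K x K correlation weights.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    for q in features:
        # Correlate this joint with every joint, near or distant.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in features]
        weights.append(softmax(scores))
    attended = [
        [sum(w * v[j] for w, v in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return attended, weights


def refine(features, steps=2):
    """Apply joint_attention repeatedly with a residual connection (assumed)."""
    weights = None
    for _ in range(steps):
        attended, weights = joint_attention(features)
        features = [
            [f + a for f, a in zip(fv, av)]
            for fv, av in zip(features, attended)
        ]
    return features, weights


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean binary focal loss: -(1 - p_t)^gamma * log(p_t).

    probs: predicted probabilities in [0, 1]; targets: 0 or 1 labels.
    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    total = 0.0
    for p, t in zip(probs, targets):
        p_t = p if t == 1 else 1.0 - p
        p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
        total += -((1.0 - p_t) ** gamma) * math.log(p_t)
    return total / len(probs)


# Three hypothetical joints with 2-dimensional features.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
refined, corr = refine(feats, steps=2)

# A poorly predicted keypoint (p = 0.3) contributes far more loss
# than a well predicted one (p = 0.9).
hard = focal_loss([0.3], [1])
easy = focal_loss([0.9], [1])
```

Because every joint attends to every other joint in a single step, the correlation weights are not limited by a convolution kernel's local window, which is the property the abstract contrasts with CNN operations. The modulating factor (1 - p_t)^gamma shrinks the loss of well-predicted keypoints, so training gradients are dominated by the difficult ones.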