Human Pose Estimation Based on Improved High-Resolution Network
To localize human body keypoints more accurately, this paper proposes a human pose estimation model and algorithm that extend the High-Resolution Network (HRNet) with a waterfall dilated (atrous) spatial convolution module and a Transformer. First, a waterfall dilated spatial convolution module is constructed to replace the fourth stage of HRNet. This reduces the large parameter count caused by fusing features at different scales and extracts multi-scale features more efficiently. Then, a Transformer based on the self-attention mechanism is introduced to process the extracted high-level features. By capturing the non-local interactions among keypoints over the global space, it obtains global information and enhances the features. Experiments show that, at an input image resolution of 256×192, the proposed model improves AP by 2.4% and 2.3% over the HRNet-W32 and HRNet-W48 baseline models, respectively, while using fewer parameters.
human pose estimation; high-resolution network; waterfall dilated convolution; attention mechanism; multi-scale
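
The waterfall arrangement of dilated convolutions described in the abstract can be illustrated with a minimal 1-D sketch in pure Python. It is not the proposed model: the function names, the all-ones kernel, and the dilation rates (1, 2, 4) are hypothetical choices made only for illustration. Each branch takes the previous branch's output as input, and every branch output is kept, so features at several receptive-field sizes are available from a single shared kernel size.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation form)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def waterfall(signal, kernel, dilations):
    """Cascade the branches: each one consumes the previous branch's output.

    All branch outputs are returned so that they could later be fused
    as multi-scale features.
    """
    outputs = []
    current = signal
    for d in dilations:
        current = dilated_conv1d(current, kernel, d)
        outputs.append(current)
    return outputs


def receptive_field(kernel_size, dilations):
    """Receptive field after each branch of the cascade."""
    rf = 1
    fields = []
    for d in dilations:
        rf += (kernel_size - 1) * d
        fields.append(rf)
    return fields


if __name__ == "__main__":
    # Hypothetical settings: a unit impulse, a 3-tap kernel, dilations 1, 2, 4.
    impulse = [0.0] * 15
    impulse[7] = 1.0
    branches = waterfall(impulse, [1.0, 1.0, 1.0], [1, 2, 4])
    for d, branch in zip([1, 2, 4], branches):
        width = sum(1 for v in branch if v != 0)
        print(f"dilation {d}: response spans {width} positions")
    print(receptive_field(3, [1, 2, 4]))
```

In this sketch the impulse response widens from branch to branch, which mirrors how a cascade enlarges the receptive field without adding a separate full-resolution fusion path for every scale. A real 2-D module would add learned weights, channels, and a fusion step, none of which are specified in the abstract.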