Lightweight human pose estimation network based on attention mechanism
To address the limited computing resources and storage space of mobile devices such as those used in smart homes, an efficient, lightweight network is proposed that uses a high-resolution network as the backbone and introduces the lightweight convolutional neural network ShuffleNetV2 together with an attention mechanism. A new block is proposed that replaces the 3×3 depthwise separable convolution of ShuffleNetV2 with a multi-head self-attention layer from the Vision Transformer. This new block replaces the second 3×3 convolution and all residual blocks in the last three stages of the high-resolution network. A new output module is also designed, which consists of a parallel dual attention structure and a convolution followed by sampling. Two networks of different sizes are obtained by setting different numbers of modules, channels, and attention heads. Experimental results show that both networks achieve higher accuracy than other mainstream human pose estimation networks while reducing the number of parameters by more than 50%, meeting the goal of a lightweight design.
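The block described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: it assumes the standard ShuffleNetV2 unit layout (channel split, a 1×1 convolution, a middle layer, a second 1×1 convolution, concatenation, channel shuffle) and only swaps the middle layer for multi-head self-attention. The function names, the use of the input itself as query, key, and value (no learned projections), and the omission of normalization and activation layers are all simplifying assumptions made for illustration. A feature map is represented as a list of spatial positions, each holding a list of channel values.

```python
import math

def channel_split(tokens):
    """Split the channels of every position into two equal halves."""
    half = len(tokens[0]) // 2
    return [t[:half] for t in tokens], [t[half:] for t in tokens]

def channel_shuffle(tokens, groups=2):
    """Interleave channels across groups, as in ShuffleNet."""
    per_group = len(tokens[0]) // groups
    order = [g * per_group + i for i in range(per_group) for g in range(groups)]
    return [[t[j] for j in order] for t in tokens]

def pointwise(tokens, weight):
    """1x1 convolution: one channel-mixing matrix shared by all positions."""
    return [[sum(w * x for w, x in zip(row, t)) for row in weight] for t in tokens]

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(x):
    """Scaled dot-product self-attention with Q = K = V = x (assumption)."""
    d = len(x[0])
    out = []
    for q in x:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, x)) for c in range(d)])
    return out

def multi_head_self_attention(tokens, heads):
    """Split channels into heads, attend within each head, concatenate."""
    d = len(tokens[0]) // heads
    per_head = [attention_head([t[h * d:(h + 1) * d] for t in tokens])
                for h in range(heads)]
    return [sum((per_head[h][i] for h in range(heads)), [])
            for i in range(len(tokens))]

def attention_shuffle_block(tokens, w_in, w_out, heads=2):
    """Hypothetical block: ShuffleNetV2 unit with attention as the middle layer."""
    left, right = channel_split(tokens)           # identity branch / processed branch
    right = pointwise(right, w_in)                # first 1x1 convolution
    right = multi_head_self_attention(right, heads)  # stands in for the 3x3 layer
    right = pointwise(right, w_out)               # second 1x1 convolution
    merged = [l + r for l, r in zip(left, right)]
    return channel_shuffle(merged, groups=2)

# Toy input: 3 spatial positions, 4 channels; identity 1x1 weights for clarity.
identity = [[1.0, 0.0], [0.0, 1.0]]
feature_map = [[1.0, 2.0, 3.0, 4.0],
               [0.5, 0.0, -1.0, 2.0],
               [2.0, 1.0, 0.0, 1.0]]
out = attention_shuffle_block(feature_map, identity, identity, heads=2)
```

In this sketch the output keeps the input's shape, half of the channels pass through unchanged, and the other half are mixed across spatial positions by attention rather than by a local 3×3 window. A real implementation would use learned query, key, and value projections and a tensor library.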