Existing human pose estimation algorithms struggle to detect the keypoints of small and large human instances accurately and have limited precision. To address these issues, this paper proposes a scale-invariant convolutional neural network for human pose estimation. First, a resizing network is constructed to resize the input image to the standard resolution, which reduces the feature loss caused by interpolation. Second, the receptive field of the network is enlarged by introducing non-local convolution. Third, a resolution attention mechanism is introduced into multi-resolution feature fusion to enhance scale invariance. Finally, an optimization network is designed to reduce the quantisation error caused by sampling. Experimental results on the COCO dataset show that the proposed algorithm achieves an average accuracy of 79.2%, which is higher than that of the other algorithms compared. The proposed algorithm therefore exhibits better scale invariance and accuracy than existing human pose estimation algorithms.
Key words
human pose estimation/convolutional neural network/scale invariance/human keypoint detection/non-local convolution/quantisation error
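
The quantisation error mentioned in the abstract can be illustrated with a small sketch. It is not the paper's optimization network; it only shows one common source of such error, assuming keypoints are encoded on a heatmap that is coarser than the input image by a fixed stride. The function names and the stride values are hypothetical and chosen for illustration.

```python
import math


def quantisation_error(x_true: float, stride: int) -> float:
    """Absolute error after encoding a coordinate on a strided grid and decoding it.

    Assumption: the keypoint is snapped to the nearest heatmap cell, and the
    cell index is mapped back to image pixels by multiplying with the stride.
    """
    cell = math.floor(x_true / stride + 0.5)  # nearest heatmap cell index
    x_decoded = cell * stride                 # back to image coordinates
    return abs(x_true - x_decoded)


def worst_case_error(stride: int, width: int = 256) -> float:
    """Largest error over all integer pixel positions in [0, width)."""
    return max(quantisation_error(x, stride) for x in range(width))


if __name__ == "__main__":
    for stride in (1, 4, 8):
        print(stride, worst_case_error(stride))
```

Under these assumptions the worst-case error grows to half the stride, which is why approaches that decode keypoints from downsampled heatmaps usually add some form of sub-pixel refinement.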