3D Human Pose Estimation Network Combining Graph Convolution and Transformer
Two-stage 3D human pose estimation has made significant progress thanks to advanced 2D pose detectors, but depth ambiguity still makes the task extremely challenging. To address this problem, a 3D human pose estimation network named MGCNTrans is proposed. The method adopts a 2D-to-3D lifting strategy and combines the advantages of a Transformer network and a spatial-channel graph convolutional network. The model takes multiple frames as input and uses information from the surrounding frames to constrain the pose estimate for the current frame. For feature learning, the graph convolutional network learns the physical connections between human joints and captures local spatial features, while the Transformer network mines the implicit relationships between joints and provides global contextual information. To improve model performance, the graph convolutional layer integrates a spatial layer and a channel layer, which enables better interaction between nodes at both local and global scales, increases feature diversity, and yields more accurate pose estimates. The results show that the MGCNTrans network achieves superior performance on the 3D human pose estimation task, which demonstrates its effectiveness.
3D human pose estimation; graph convolutional network; Transformer network
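
The local/global split described in the abstract can be illustrated with a small sketch. It is not the MGCNTrans implementation: the five-joint skeleton, the function names, the identity weight matrix, the single attention head without learned projections, and the additive fusion are all assumptions made for illustration, and the multi-frame input and the channel layer are left out. The graph convolution mixes features only along skeleton edges, while self-attention lets every joint attend to every other joint.

```python
import math

# Hypothetical toy skeleton (not the paper's joint set):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hand, 4 = right hand
EDGES = [(0, 1), (1, 2), (1, 3), (1, 4)]
NUM_JOINTS = 5


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(edges, n):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of the skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def graph_conv(x, a_hat, w):
    """Local branch: each joint aggregates features from its skeleton neighbours."""
    return matmul(matmul(a_hat, x), w)


def softmax(row):
    m = max(row)
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def self_attention(x):
    """Global branch: scaled dot-product attention over all joints.

    Query, key and value are all x itself (no learned projections),
    which is a simplification for this sketch.
    """
    d = len(x[0])
    x_t = [list(col) for col in zip(*x)]
    scores = matmul(x, x_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, x), weights


def fuse(x, a_hat, w):
    """Add the local and global branches (assumed fusion, for illustration)."""
    local = graph_conv(x, a_hat, w)
    global_, weights = self_attention(x)
    fused = [[l + g for l, g in zip(lr, gr)] for lr, gr in zip(local, global_)]
    return fused, weights


# Made-up 2D joint coordinates for one frame, one row per joint.
pose_2d = [[0.0, 0.0], [0.0, 0.5], [0.0, 1.0], [-0.4, 0.6], [0.4, 0.6]]
w_identity = [[1.0, 0.0], [0.0, 1.0]]

a_hat = normalized_adjacency(EDGES, NUM_JOINTS)
fused, attn = fuse(pose_2d, a_hat, w_identity)
```

In this toy graph the head and the left hand share no edge, so their entry in `a_hat` is zero and the graph convolution cannot relate them in a single layer, whereas the attention weights between them are strictly positive. That is the complementary behaviour the abstract attributes to the two branches.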