To address the problem of grasping occluded workpieces of varying position and size in industrial production environments, an improved visual pose estimation network based on PVNet is proposed. In the backbone network, Res2Net modules replace the original residual structure to improve the extraction of multi-scale features. The network then regresses, for each pixel, a unit vector pointing to each keypoint, and the keypoint positions are computed with a voting algorithm. Finally, the workpiece pose is computed with EPnP. For the experiments, a test dataset was created from images captured with a depth camera, and the 2D projection metric and the average 3D distance of model points (ADD) are used as evaluation criteria. The results show that the improved pose estimation network increases detection accuracy and multi-scale capability, is robust to occluded objects, and runs fast enough to meet the requirements of robot grasping in actual production.
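The voting step can be illustrated with a minimal sketch. It assumes a RANSAC-style scheme in the spirit of PVNet: pairs of pixels are sampled, the two rays defined by their predicted unit vectors are intersected to form a keypoint hypothesis, and each hypothesis is scored by how many pixels point towards it. The function name `vote_keypoint`, the number of hypotheses, and the cosine threshold are illustrative assumptions, not values taken from this work.

```python
import numpy as np

def vote_keypoint(pixels, directions, n_hypotheses=128, cos_thresh=0.99, seed=0):
    """Estimate one 2D keypoint from per-pixel unit vectors (illustrative sketch).

    pixels:     (N, 2) pixel coordinates inside the object mask
    directions: (N, 2) predicted unit vectors, each pointing at the keypoint
    Returns (best_hypothesis, inlier_count).
    n_hypotheses and cos_thresh are assumed values, not from the source.
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=float)
    directions = np.asarray(directions, dtype=float)
    n = len(pixels)
    best, best_score = None, -1
    for _ in range(n_hypotheses):
        i, j = rng.choice(n, size=2, replace=False)
        # Intersect the two rays: p_i + t_i * d_i = p_j + t_j * d_j
        A = np.column_stack([directions[i], -directions[j]])
        if abs(np.linalg.det(A)) < 1e-6:
            continue  # nearly parallel rays give no stable intersection
        t = np.linalg.solve(A, pixels[j] - pixels[i])
        hypothesis = pixels[i] + t[0] * directions[i]
        # A pixel votes for the hypothesis if its predicted direction
        # agrees with the direction from that pixel to the hypothesis.
        to_h = hypothesis - pixels
        norms = np.linalg.norm(to_h, axis=1)
        valid = norms > 1e-9
        cos = np.zeros(n)
        cos[valid] = np.einsum("ij,ij->i", to_h[valid], directions[valid]) / norms[valid]
        score = int(np.sum(cos >= cos_thresh))
        if score > best_score:
            best, best_score = hypothesis, score
    return best, best_score
```

Repeating this for every keypoint yields the 2D locations that, together with the corresponding 3D model points and the camera intrinsics, would be passed to an EPnP solver to recover the pose. Because a hypothesis only needs the support of the visible pixels, the estimate can remain stable when part of the workpiece is occluded.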