To better balance speed and accuracy in clothing landmark detection, a T-shirt landmark detection method named YOLO-T-Shirt is proposed. It builds on the human pose estimation network YOLOv8s-Pose and combines a cascade architecture with fused geometric information. First, inspired by the CFNet architecture, the cascade fusion network design is introduced into YOLOv8s-Pose, and the original feature extraction and feature fusion architecture is redesigned to better integrate multi-scale features, so that the model is robust to changes in clothing size and shape. Second, the native OKS loss function is optimized: an efficient landmark similarity loss function, EOKS (Efficient Object Keypoint Similarity), is proposed that integrates geometric information on the area, width, height, and center-point distance of the bounding box, in order to improve landmark detection accuracy. On the T-shirt category of the DeepFashion2 dataset, the proposed method achieves a landmark detection accuracy of 0.760, close to the 0.765 achieved by the most accurate current clothing landmark detection algorithm, while its inference speed is more than 9 times faster.
Key words
deep learning/clothing landmark detection/YOLOv8/cascade network/loss function optimization
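For orientation, the native OKS measure that EOKS starts from can be sketched as below. This is only the standard COCO-style OKS and the loss 1 - OKS derived from it; the abstract does not give the EOKS formula, so the geometric terms (area, width, height, center-point distance) are not reproduced here. The function names, the three-landmark example, and the per-landmark `sigmas` values are illustrative assumptions, not values from the paper.

```python
import math


def oks(pred, target, visible, area, sigmas):
    """Standard COCO-style Object Keypoint Similarity (not the paper's EOKS).

    pred, target: lists of (x, y) landmark coordinates.
    visible:      per-landmark visibility flags; landmarks with flag <= 0 are skipped.
    area:         object area used as the squared scale s^2.
    sigmas:       per-landmark falloff constants (hypothetical values here).
    """
    total, count = 0.0, 0
    for (px, py), (gx, gy), v, sigma in zip(pred, target, visible, sigmas):
        if v <= 0:
            continue  # unlabeled landmarks do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared Euclidean distance
        k = 2.0 * sigma                       # COCO convention: k_i = 2 * sigma_i
        total += math.exp(-d2 / (2.0 * area * k ** 2 + 1e-9))
        count += 1
    return total / count if count else 0.0


def oks_loss(pred, target, visible, area, sigmas):
    """Baseline loss 1 - OKS, averaged over labeled landmarks."""
    return 1.0 - oks(pred, target, visible, area, sigmas)


# Illustrative data: three landmarks, the last one unlabeled.
target = [(10.0, 10.0), (50.0, 12.0), (30.0, 60.0)]
visible = [2, 2, 0]
sigmas = [0.05, 0.05, 0.05]   # assumed constants for illustration only
area = 40.0 * 50.0            # bounding-box area as the scale term

exact = oks(target, target, visible, area, sigmas)
shifted = oks([(12.0, 10.0), (50.0, 15.0), (0.0, 0.0)], target, visible, area, sigmas)
```

In this sketch a perfect prediction gives an OKS of 1 and a loss of 0, and the error on the unlabeled third landmark has no effect on the score.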