Lightweight human pose estimation algorithm combined with coordinate Transformer


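Abstract: Most existing bottom-up human pose estimation algorithms suffer from large model size, high computational cost, and poor suitability for edge devices. To address these problems, a lightweight multi-person pose estimation network, YOLOv5s6-Pose-CT, is proposed on the basis of YOLOv5s6-Pose. The model introduces spatial and channel reconstruction convolution into the neck network to reduce feature redundancy along the spatial and channel dimensions. A coordinate Transformer is also proposed and embedded in the backbone network, so that the model captures long-range dependencies while retaining efficient local feature extraction. In addition, unbiased feature position alignment is used to resolve the feature misalignment that arises during multi-scale fusion. Finally, the bounding-box regression loss is redefined with the MPDIoU loss function. Experiments on the COCO 2017 dataset show that, compared with the mainstream lightweight network EfficientHRNet-H1, the optimized model reduces the parameter count by 16.2% and the computation by 66.1% at the same accuracy. Compared with the baseline YOLOv5s6-Pose, the parameter count is reduced by 11.2% and the computation by 5.8%, while average precision and average recall improve by 2.5% and 2.6%, respectively.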

English title: Lightweight human pose estimation algorithm combined with coordinate Transformer

English abstract: To address the large model size, high computational cost, and limited compatibility with edge devices of most existing bottom-up human pose estimation algorithms, this study proposed a lightweight multi-person pose estimation network named YOLOv5s6-Pose-CT, based on YOLOv5s6-Pose. To reduce feature redundancy across both the spatial and channel dimensions, the model introduced spatial and channel reconstruction convolution in the neck network. A coordinate Transformer was also incorporated into the backbone network to capture long-range dependencies while maintaining efficient local feature extraction. Furthermore, unbiased feature position alignment was employed to resolve feature misalignment during multi-scale fusion. Finally, this study redefined the bounding-box regression loss using the MPDIoU (minimum point distance-based IoU) loss function. Experimental results on the COCO 2017 dataset demonstrated that, compared with EfficientHRNet-H1 (a mainstream lightweight network), the optimized model reduced parameters by 16.2% and computation by 66.1% while maintaining comparable accuracy. Moreover, compared with the baseline YOLOv5s6-Pose, the proposed model reduced parameters by 11.2% and computation by 5.8%, and improved average precision by 2.5% and average recall by 2.6%.
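The abstract names MPDIoU as the bounding-box regression loss but does not give its form. As a rough illustration only, the sketch below follows the commonly cited definition of MPDIoU (IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared input-image diagonal). The function names, the `(x1, y1, x2, y2)` box layout, and the plain-Python scalar implementation are assumptions made for this example and are not taken from the paper.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """Illustrative MPDIoU between two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2; img_w and img_h are the input image size.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Standard IoU: intersection area over union area.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distances between matching corners of the two boxes.
    d1_sq = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2  # top-left corners
    d2_sq = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2  # bottom-right corners

    # Normalize by the squared diagonal of the input image.
    norm = img_w ** 2 + img_h ** 2
    return iou - d1_sq / norm - d2_sq / norm


def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss form used for regression: 1 - MPDIoU (0 when boxes coincide)."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)
```

Under this definition the loss still produces a gradient signal when the boxes do not overlap, because the corner-distance terms remain non-zero even when IoU is 0. A training implementation would typically vectorize this over tensors of boxes rather than use Python scalars.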

English keywords:

human pose estimation; lightweight; coordinate Transformer; unbiased feature position alignment; loss function

Authors:

Huang Youwen (黄友文), Lin Zhiqin (林志钦), Zhang Jin (章劲), Chen Junkuan (陈俊宽)


Author affiliation:

School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000

Keywords:

human pose estimation; lightweight; coordinate Transformer; unbiased feature position alignment; loss function

Funding:

Jiangxi Provincial Department of Education project

Project number:

GJJ180443

Publication year:

2024

DOI:

10.11996/JG.j.2095-302X.2024030516

Journal: Journal of Graphics (图学学报)

Publisher: China Graphics Society (中国图学学会)

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 0.73

ISSN: 2095-302X

Year, volume (issue): 2024, 45(3)

Number of references: 39