Lightweight human pose estimation algorithm combined with coordinate Transformer

Most existing bottom-up human pose estimation algorithms suffer from large model size, high computational cost, and poor suitability for edge devices. To address these issues, this study proposes YOLOv5s6-Pose-CT, a lightweight multi-person pose estimation network based on YOLOv5s6-Pose. The model introduces spatial and channel reconstruction convolution into the neck network to reduce feature redundancy across the spatial and channel dimensions. A coordinate Transformer is proposed and embedded in the backbone network so that the model captures long-range dependencies while retaining efficient local feature extraction. Unbiased feature position alignment is then used to resolve the feature misalignment that arises during multi-scale fusion. Finally, the bounding-box regression loss is redefined using the MPDIoU (minimum point distance-based IoU) loss function. Experiments on the COCO 2017 dataset show that, compared with the mainstream lightweight network EfficientHRNet-H1, the proposed model reduces the parameter count by 16.2% and the computation by 66.1% while maintaining comparable accuracy. Compared with the baseline YOLOv5s6-Pose, it reduces the parameter count by 11.2% and the computation by 5.8%, and improves average detection accuracy and average recall by 2.5% and 2.6%, respectively.
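The abstract names MPDIoU as the bounding-box regression loss but does not give its formula. The sketch below assumes the commonly cited formulation, in which the IoU is penalized by the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared diagonal of the input image. The function name `mpdiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` term are illustrative choices. How YOLOv5s6-Pose-CT integrates the loss is not described here.

```python
def mpdiou_loss(pred, target, img_w, img_h, eps=1e-9):
    """Illustrative MPDIoU loss for one pair of axis-aligned boxes.

    Assumed layout: boxes are (x1, y1, x2, y2) in pixels with x1 < x2, y1 < y2.
    img_w and img_h are the width and height of the input image.
    """
    px1, py1, px2, py2 = pred
    tx1, ty1, tx2, ty2 = target

    # Plain IoU: intersection area over union area.
    inter_w = max(0.0, min(px2, tx2) - max(px1, tx1))
    inter_h = max(0.0, min(py2, ty2) - max(py1, ty1))
    inter = inter_w * inter_h
    union = (px2 - px1) * (py2 - py1) + (tx2 - tx1) * (ty2 - ty1) - inter
    iou = inter / (union + eps)

    # Squared distances between matching corners.
    d1_sq = (px1 - tx1) ** 2 + (py1 - ty1) ** 2  # top-left corners
    d2_sq = (px2 - tx2) ** 2 + (py2 - ty2) ** 2  # bottom-right corners

    # Normalize by the squared image diagonal (assumed normalizer).
    norm = img_w ** 2 + img_h ** 2
    mpdiou = iou - d1_sq / norm - d2_sq / norm
    return 1.0 - mpdiou


if __name__ == "__main__":
    same = mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)
    apart = mpdiou_loss((0, 0, 10, 10), (20, 20, 30, 30), 100, 100)
    print(f"identical boxes: {same:.6f}, disjoint boxes: {apart:.6f}")
```

Unlike plain IoU, this loss keeps changing with corner distance even when the boxes do not overlap, so non-overlapping predictions still receive a useful regression signal.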

human pose estimation; lightweight; coordinate Transformer; unbiased feature position alignment; loss function

黄友文、林志钦、章劲、陈俊宽


School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000


Project of the Jiangxi Provincial Department of Education (GJJ180443)

2024

Journal of Graphics (图学学报)
China Graphics Society


Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.73
ISSN: 2095-302X
Year, volume (issue): 2024, 45(3)