Lightweight Human Pose Estimation Based on Decoupled Attention and Ghost Convolution

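Chinese abstract (translated): With the development of lightweight networks, human pose estimation can now run on devices with limited computational resources, but improving accuracy has become more challenging. The difficulty stems mainly from the tension between network complexity and computational resources, which causes models to lose representational capacity when they are simplified. To address this, a lightweight human pose estimation network based on decoupled attention and ghost convolution (DGLNet) is proposed. Specifically, DGLNet takes the Small High-Resolution Network (Small HRNet) as its base architecture and builds a DFDbottleneck module by introducing a decoupled attention mechanism. The basic module is redesigned following the shuffleblock structure: computationally expensive point convolutions are replaced with lightweight ghost convolutions, and decoupled attention is used to strengthen the module, yielding the DGBblock module. In addition, the original transition layer module is replaced with a depthwise separable convolution module rebuilt from ghost convolution and decoupled attention, yielding the GSCtransition module, which further reduces computation while improving feature interaction and performance. Experiments on the COCO validation set show that DGLNet outperforms the lightweight high-resolution network (Lite-HRNet), reaching a highest accuracy of 71.9% with no increase in computation or parameter count. Compared with the common lightweight pose estimation networks MobileNetV2 and ShuffleNetV2, DGLNet improves accuracy by 4.6 and 8.3 percentage points while using only 21.2% and 25.0% of their computation, respectively. On the AP50 metric, DGLNet surpasses the large High-Resolution Network (HRNet) with far less computation and far fewer parameters.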

English title: Lightweight human pose estimation based on decoupled attention and ghost convolution

English abstract: With the development of lightweight networks, human pose estimation tasks can be performed on devices with limited computational resources. However, improving accuracy has become more challenging. These challenges stem mainly from the contradiction between network complexity and computational resources, which leads to a sacrifice of representation capability when the model is simplified. To address these issues, a Decoupled attention and Ghost convolution based Lightweight human pose estimation Network (DGLNet) was proposed. Specifically, DGLNet takes the Small High-Resolution Network (Small HRNet) model as its basic architecture, and a DFDbottleneck module was constructed by introducing a decoupled attention mechanism. The basic modules were redesigned with the shuffleblock structure, in which computationally intensive point convolutions were replaced with lightweight ghost convolutions and the decoupled attention mechanism was used to enhance module performance, yielding the DGBblock module. Additionally, the original transition layer modules were replaced with redesigned depthwise separable convolution modules that incorporate ghost convolution and decoupled attention, yielding the GSCtransition module. This modification further reduced computational complexity while enhancing feature interaction and performance. Experimental results on the COCO validation set show that DGLNet outperforms the state-of-the-art Lite-High-Resolution Network (Lite-HRNet) model, achieving a maximum accuracy of 71.9% without increasing computational complexity or the number of parameters. Compared with common lightweight pose estimation networks such as MobileNetV2 and ShuffleNetV2, DGLNet improves precision by 4.6 and 8.3 percentage points, respectively, while using only 21.2% and 25.0% of their computation. Furthermore, under the AP50 evaluation criterion, DGLNet surpasses the large High-Resolution Network (HRNet) while requiring significantly less computation and far fewer parameters.
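To illustrate why replacing point convolutions with ghost convolutions can reduce computation, the following is a minimal sketch that counts multiply-accumulate operations (MACs) for both layer types. It assumes the standard ghost module formulation, in which a primary 1x1 convolution produces only a fraction of the output channels and the remaining channels are generated by cheap depthwise operations. The function names, the `ratio` and `cheap_kernel` defaults, and the feature-map size are hypothetical choices for illustration; the abstract does not state the settings DGLNet actually uses.

```python
def pointwise_conv_macs(h, w, c_in, c_out):
    """MACs of a 1x1 (point) convolution applied to an h x w feature map."""
    return h * w * c_in * c_out


def ghost_conv_macs(h, w, c_in, c_out, ratio=2, cheap_kernel=3):
    """MACs of a ghost module: 1x1 primary conv plus depthwise cheap operations.

    ratio and cheap_kernel are illustrative assumptions, not DGLNet's settings.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    # The primary convolution only produces c_out / ratio "intrinsic" channels.
    intrinsic = c_out // ratio
    primary = h * w * c_in * intrinsic
    # Each intrinsic channel yields (ratio - 1) extra channels through a
    # depthwise cheap_kernel x cheap_kernel convolution.
    cheap = h * w * intrinsic * (ratio - 1) * cheap_kernel * cheap_kernel
    return primary + cheap


if __name__ == "__main__":
    # Hypothetical feature-map size and channel counts, chosen for illustration.
    h, w, c_in, c_out = 64, 48, 40, 40
    pw = pointwise_conv_macs(h, w, c_in, c_out)
    gh = ghost_conv_macs(h, w, c_in, c_out)
    print(f"pointwise: {pw} MACs, ghost: {gh} MACs, ghost/pointwise: {gh / pw:.2f}")
```

With a ratio of 2, the primary convolution costs half as much as the full point convolution, and the depthwise cheap operations add a term that does not depend on the number of input channels, so the saving grows as the channel count increases.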

English keywords: human pose estimation; lightweight network; attention mechanism; ghost convolution; depthwise separable convolution module

Authors: 陈俊颖, 郭士杰, 陈玲玲

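Affiliations:

Academy for Engineering and Technology, Fudan University, Shanghai 200433

School of Mechanical Engineering, Hebei University of Technology, Tianjin 300130

Engineering Research Center of Intelligent Rehabilitation Devices and Detection Technology, Ministry of Education (Hebei University of Technology), Tianjin 300401

School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300130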

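Keywords: human pose estimation (人体姿态估计); lightweight network (轻量级网络); attention mechanism (注意力机制); ghost convolution (幻影卷积); depthwise separable convolution module (深度可分离卷积模块)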

Publication year: 2025

DOI: 10.11772/j.issn.1001-9081.2024010099

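Journal of Computer Applications (计算机应用)

Chengdu Institute of Computer Applications, Chinese Academy of Sciences

PKU Core Journal (北大核心)

Impact factor: 0.892

ISSN: 1001-9081

Year, volume (issue): 2025, 45(1)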