Human Pose Estimation Based on Improved High-Resolution Network (基于改进高分辨率网络的人体姿态估计)


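Chinese abstract (translated): To localize human keypoints more accurately, this paper proposes a human pose estimation model and algorithm that takes the High-Resolution Network (HRNet) as its baseline and introduces a waterfall atrous (dilated) spatial convolution module and a Transformer. First, a waterfall atrous spatial convolution module is constructed to replace the fourth stage of HRNet; this mitigates the excessive parameter count caused by fusing features across different scales and extracts multi-scale features more efficiently. Second, a Transformer based on the self-attention mechanism is introduced to process the extracted high-level features; it captures the non-local interactions among keypoints over the global spatial extent, obtaining global information to enhance the features. Experiments show that, with an input image resolution of 256×192, the proposed model improves AP by 2.4% and 2.3% over the HRNet-W32 and HRNet-W48 baselines, respectively, while using fewer parameters.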
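The "waterfall" arrangement mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' module: the real one operates on 2-D feature maps with learned weights, and the abstract does not give its dilation rates. The sketch assumes a 1-D signal, a fixed 3-tap kernel, and hypothetical rates (1, 2, 4). The helper names `dilated_conv1d`, `waterfall_features`, and `receptive_field` are invented for illustration. The sketch shows that cascading dilated convolutions, so that each branch consumes the previous branch's output, widens the receptive field at every step while reusing the same small kernel.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Same'-padded 1-D cross-correlation with a dilated, odd-length kernel."""
    half = (len(kernel) // 2) * dilation
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += w * padded[i + k * dilation]
        out.append(acc)
    return out


def waterfall_features(signal, kernel, dilations=(1, 2, 4)):
    """Cascade the branches: each dilated convolution takes the previous
    branch's output as input. Every branch output is returned so that the
    multi-scale responses can be concatenated afterwards."""
    branches = []
    current = list(signal)
    for d in dilations:
        current = dilated_conv1d(current, kernel, d)
        branches.append(current)
    return branches


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of a cascade of dilated convolutions."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Usage: feed a unit impulse and count how many outputs it reaches per branch.
impulse = [0.0] * 15
impulse[7] = 1.0
branches = waterfall_features(impulse, [1.0, 1.0, 1.0])
support = [sum(1 for v in b if v != 0) for b in branches]
print(support)  # [3, 7, 15]
```

With a 3-tap kernel, each branch adds 2 × dilation samples to the receptive field, so the three branches see 3, 7, and 15 samples while sharing the same kernel size. That is the sense in which a cascade can cover several scales with few parameters.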

English title: Human Pose Estimation Based on Improved High-Resolution Network

English abstract: To localize human body keypoints more accurately, a human pose estimation model and algorithm are proposed that use the High-Resolution Network (HRNet) as the baseline and introduce a waterfall atrous spatial convolution module and a Transformer. First, a waterfall atrous spatial convolution module is constructed to replace the fourth stage of HRNet, which alleviates the large parameter count caused by fusing features at different scales and extracts multi-scale features more efficiently. Second, a Transformer based on the self-attention mechanism is introduced to process the extracted high-level features; features are enhanced by capturing the non-local interactions among keypoints in the global space to obtain global information. Experiments show that, when the input image resolution is 256×192, the proposed model improves AP by 2.4% and 2.3% over the HRNet-W32 and HRNet-W48 baseline models, respectively, with a lower parameter count.
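The self-attention step mentioned in the abstract can be sketched in the same hedged spirit. This is a minimal single-head, scaled dot-product self-attention over the flattened positions of a tiny feature map. It is not the paper's Transformer: the query, key, and value projections are assumed to be identity maps, and there are no positional encodings, multiple heads, or feed-forward layers. The helper names `flatten_feature_map` and `self_attention` are hypothetical. The point is only that every position attends to every other position, regardless of spatial distance.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def flatten_feature_map(fmap):
    """Turn an H x W x C nested list into H*W tokens of length C."""
    return [pixel for row in fmap for pixel in row]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: Q, K and V are the tokens themselves
    (identity projections); a real Transformer learns these projections.
    """
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[j] for w, v in zip(row, tokens)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Usage: a 2 x 2 feature map with 2 channels; positions 0 and 2 hold the
# same feature vector but lie in different rows.
fmap = [[[1.0, 0.0], [0.0, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
tokens = flatten_feature_map(fmap)
outputs, weights = self_attention(tokens)
```

Here position 0 assigns the same weight to position 2 as to itself, because the weights depend on feature similarity rather than on how far apart the positions are. This is the non-local interaction the abstract refers to.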

English keywords:

human pose estimation; high-resolution network; waterfall dilated convolution; attention mechanism; multi-scale

Authors:

Liu Jie (刘洁), Chen Zhi (陈志), Yue Wenjing (岳文静)


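Affiliations:

School of Computer Science, Nanjing University of Posts and Telecommunications

School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003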

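Keywords (translated from Chinese):

human pose estimation (人体姿态估计); high-resolution network (高分辨率网络); waterfall atrous convolution (瀑布式空洞卷积); attention mechanism (注意力机制); multi-scale (多尺度)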

Publication year:

2024

DOI:

10.11907/rjdk.241182

Software Guide (软件导刊)

Hubei Information Society (湖北省信息学会)


Impact factor: 0.524

ISSN: 1672-7800

Year, volume (issue): 2024, 23(6)