Human action recognition based on skeleton dynamic temporal filter

Original title: 基于骨骼点动态时域滤波的人体动作识别

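Abstract (translated from Chinese): Human action recognition is an important research direction in computer vision and is widely used in intelligent surveillance, human-computer interaction, and other fields. Existing skeleton-based action recognition methods mostly cascade a graph convolutional network (GCN) with a temporal convolutional network (TCN), and the kernel size of the latter limits the model's global temporal modeling capability. In addition, processing skeleton data with convolution alone lacks the ability to discriminate between different skeleton points, and TCN often repeats computation when extracting features, so the number of TCN parameters grows as the network deepens. Drawing on signal processing, a skeleton dynamic temporal filtering (SDTF) module suited to skeleton data is proposed to replace TCN for global modeling of temporal features. On this basis, AGCN is given a lightweight redesign, and the resulting AGCN-SDTF action recognition model reduces model complexity. SDTF models temporal features through the Fourier transform: the frequency-domain features obtained from the Fourier transform are multiplied by the frequency-domain output obtained from filtering, and the product is passed through the inverse Fourier transform, thereby extracting global temporal features. Experimental results on the large-scale NTU-RGBD and Kinetics-Skeleton datasets show that the model reduces the required number of parameters and amount of computation while achieving the same recognition performance as the original model.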

English abstract: Human action recognition is a key research area in computer vision, with a wide range of applications such as human-computer interaction and intelligent surveillance. Existing methods for skeleton-based action recognition often combine graph convolutional networks (GCN) with temporal convolutional networks (TCN). However, the limited size of the TCN's convolutional kernel restricts the model's global temporal modeling capability. Moreover, applying convolution alone to skeletal data leads to a lack of discriminative power among different skeleton points. Furthermore, using TCN to extract features often entails repeated calculations, so the number of TCN parameters increases as the network deepens. To address these issues, a skeleton dynamic temporal filtering (SDTF) module based on signal processing methods was proposed for skeleton-based action recognition, replacing TCN for global temporal modeling. On this basis, lightweight improvements were made to AGCN, reducing model complexity. SDTF modeled temporal features through the Fourier transform: the frequency-domain features obtained from the Fourier transform were multiplied by the frequency-domain output obtained from filtering, and the product was passed through the inverse Fourier transform. Extensive experiments conducted on the NTU-RGBD and Kinetics-Skeleton datasets demonstrated that the proposed model significantly reduced network parameters and computational complexity, while achieving comparable or even superior recognition performance compared to the original model.
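
The frequency-domain step described in the abstract (Fourier transform over time, multiplication by a filter, inverse Fourier transform) can be illustrated with a minimal NumPy sketch. This is not the authors' SDTF implementation: the abstract does not say how the dynamic filter is produced, so the function name `temporal_frequency_filter`, the `(C, T, V)` tensor layout, and the randomly generated `kernel` below are all assumptions made for illustration.

```python
import numpy as np


def temporal_frequency_filter(x, freq_filter):
    """Filter a skeleton feature tensor along time in the frequency domain.

    x           : real array of shape (C, T, V) = (channels, frames, joints)
    freq_filter : complex array broadcastable to (C, T // 2 + 1, V); a
                  hypothetical stand-in for the filter an SDTF-like module
                  would produce
    """
    num_frames = x.shape[1]
    spectrum = np.fft.rfft(x, axis=1)   # Fourier transform over the time axis
    filtered = spectrum * freq_filter   # element-wise product in frequency
    # Inverse transform back to the time domain, restoring T frames.
    return np.fft.irfft(filtered, n=num_frames, axis=1)


rng = np.random.default_rng(0)
C, T, V = 4, 16, 25  # illustrative sizes; NTU-RGBD skeletons have 25 joints
x = rng.standard_normal((C, T, V))

# An all-ones filter leaves the signal unchanged.
identity = np.ones((C, T // 2 + 1, V), dtype=complex)
assert np.allclose(temporal_frequency_filter(x, identity), x)

# A filter derived from a length-T temporal kernel (random here, purely for
# illustration) acts as a circular convolution over all T frames.
kernel = rng.standard_normal((C, T, V))
y = temporal_frequency_filter(x, np.fft.rfft(kernel, axis=1))
```

Because the product is taken over the whole spectrum, every output frame in `y` depends on every input frame, which is the sense in which such filtering is global, in contrast to a convolution with a short temporal kernel.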

English keywords:

human action recognition; graph convolutional network; dynamic temporal filter; Fourier transform; temporal convolutional networks

Authors:

Li Songyang (李松洋), Wang Xueting (王雪婷), Chen Xianglong (陈相龙), Chen Enqing (陈恩庆)

Affiliation:

School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou, Henan 450001

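Keywords (translated from Chinese):

human action recognition; graph convolutional network; dynamic temporal filtering; Fourier transform; temporal convolutional network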

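Funding:

National Natural Science Foundation of China (two projects); Henan Province Science and Technology Research Project; National Supercomputing Center in Zhengzhou support project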

Project numbers:

62101503; U1804152; 222102210102

Publication year:

2024

DOI:

10.11996/JG.j.2095-302X.2024040760

Journal of Graphics (图学学报)

China Graphics Society (中国图学学会)


Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.73

ISSN: 2095-302X

Year, volume (issue): 2024, 45(4)