Journal of Hefei University (Comprehensive Edition), 2024, Vol. 41, Issue 5: 94-101.

Self-Attention Enhanced Graph Convolution Network for 3D Skeleton Based Human Action Recognition

DING Yue, WU Zhize

Author Information

  • 1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China


Abstract

Existing methods based on standard graph convolutional networks mainly rely on local graph convolution operations, limiting their flexibility in capturing complex long-range associations between joints. To address these issues, a Self-Attention Enhanced Graph Convolutional Network (SGNet) is proposed. Leveraging the characteristics of skeletal data, independent global modeling is performed for each channel of the joint features, termed Channel-Specific Global Spatial Modeling (C-GSM). This is carried out in parallel with Local Spatial Modeling (LSM) to extract both local and global spatial feature representations. Extensive experiments were conducted on two large and challenging benchmark datasets, NTU RGB+D and NTU RGB+D 120. SGNet demonstrates highly competitive results in comparison with state-of-the-art methods, achieving top accuracies of 92.9% on NTU RGB+D X-Sub and 90.7% on NTU RGB+D 120 X-Set.
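The parallel LSM/C-GSM design summarized above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the per-channel outer-product attention, the symmetrically normalized adjacency, the single-frame (channels × joints) feature layout, and all function names here are assumptions made for the sketch.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def local_spatial_modeling(X, A):
    """LSM branch: graph convolution over the skeleton graph.

    X: (C, V) features, A: (V, V) adjacency with self-loops.
    Uses the symmetrically normalized adjacency D^(-1/2) A D^(-1/2).
    """
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt
    return X @ A_hat

def channel_specific_global_modeling(X):
    """C-GSM branch: each channel builds its own joint-to-joint
    attention map, so long-range joint pairs can interact directly."""
    C, V = X.shape
    out = np.empty_like(X)
    for c in range(C):
        x = X[c]                        # (V,) this channel's joint values
        affinity = np.outer(x, x)       # (V, V) pairwise joint affinities
        attn = softmax(affinity, axis=-1)  # rows sum to 1
        out[c] = attn @ x               # attention-weighted aggregation
    return out

def sgnet_spatial_block(X, A):
    # The two branches run in parallel and their outputs are fused (summed).
    return local_spatial_modeling(X, A) + channel_specific_global_modeling(X)
```

Note the design point the abstract emphasizes: the attention map in C-GSM is built independently per channel rather than shared across channels, while the LSM branch keeps the fixed skeleton topology; fusing the two preserves local structure without restricting interactions to graph neighbors.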


Key words

skeleton-based human action recognition/graph convolution network/self-attention


Publication Year

2024

Journal: Journal of Hefei University (Comprehensive Edition)
Publisher: Hefei University
Impact Factor: 0.426
ISSN: 2096-2371