Journal of Hefei University (Comprehensive Edition), 2024, Vol. 41, Issue 5: 94-101.

Self-Attention Enhanced Graph Convolution Network for 3D Skeleton Based Human Action Recognition

DING Yue, WU Zhize

Author Information

  • 1. School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China


Abstract

Existing methods based on standard graph convolutional networks mainly rely on local graph convolution operations, limiting their flexibility in capturing complex long-range associations between joints. To address these issues, a Self-Attention Enhanced Graph Convolutional Network (SGNet) is proposed. Leveraging the characteristics of skeletal data, independent global modeling is performed for each channel of the joint features, termed Channel-Specific Global Spatial Modeling (C-GSM). This is carried out in parallel with Local Spatial Modeling (LSM) to extract both local and global spatial feature representations. Extensive experiments were conducted on two large and challenging benchmark datasets, NTU RGB+D and NTU RGB+D 120. SGNet demonstrates highly competitive results in comparison with state-of-the-art methods, achieving top accuracies of 92.9% on NTU RGB+D X-Sub and 90.7% on NTU RGB+D 120 X-Set.
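The parallel LSM/C-GSM design summarized above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the per-channel outer-product attention, the symmetrically normalized adjacency, the single-frame (channels × joints) feature layout, and all function names here are assumptions made for the sketch.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def local_spatial_modeling(X, A):
    """LSM branch: graph convolution over the skeleton graph.

    X: (C, V) features, A: (V, V) adjacency with self-loops.
    Uses the symmetrically normalized adjacency D^(-1/2) A D^(-1/2).
    """
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt
    return X @ A_hat

def channel_specific_global_modeling(X):
    """C-GSM branch: each channel builds its own joint-to-joint
    attention map, so long-range joint pairs can interact directly."""
    C, V = X.shape
    out = np.empty_like(X)
    for c in range(C):
        x = X[c]                        # (V,) this channel's joint values
        affinity = np.outer(x, x)       # (V, V) pairwise joint affinities
        attn = softmax(affinity, axis=-1)  # rows sum to 1
        out[c] = attn @ x               # attention-weighted aggregation
    return out

def sgnet_spatial_block(X, A):
    # The two branches run in parallel and their outputs are fused (summed).
    return local_spatial_modeling(X, A) + channel_specific_global_modeling(X)
```

Note the design point the abstract emphasizes: the attention map in C-GSM is built independently per channel rather than shared across channels, while the LSM branch keeps the fixed skeleton topology; fusing the two preserves local structure without restricting interactions to graph neighbors.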


Key words

skeleton-based human action recognition/graph convolution network/self-attention


Publication Year

2024

Journal: Journal of Hefei University (Comprehensive Edition)
Publisher: Hefei University
Impact Factor: 0.426
ISSN: 2096-2371