Journal of Applied Sciences (应用科学学报), 2024, Vol. 42, Issue 2: 189-199. DOI: 10.3969/j.issn.0255-8297.2024.02.001

Sign Language Recognition Based on Two-Stream Adaptive Enhanced Spatial Temporal Graph Convolutional Network

Jin Yanliang (金彦亮), Wu Xiaowei (吴筱溦)

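Author information

  • 1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China; Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China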

Abstract

To address the weak representational ability and incomplete information that arise when extracting sign language features, this paper designs a two-stream adaptive enhanced spatial temporal graph convolutional network (TAEST-GCN) for isolated-word sign language recognition. The network takes body, hand, and face nodes as input and builds a two-stream structure based on human joints and bones. An adaptive spatial temporal graph convolutional module generates connections between different body parts and makes full use of their position and direction information. In addition, an adaptive multi-scale spatial temporal attention module, built with residual connections, further strengthens the network's convolution ability in both the spatial and temporal domains. The effective features extracted by the two streams are fused by weighting to classify and output the sign language vocabulary. Experiments on a public Chinese sign language isolated-word dataset achieve accuracies of 95.57% and 89.62% on the 100-class and 500-class tasks, respectively.
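The joint and bone streams and their weighted fusion can be illustrated with a small sketch. This is not the authors' implementation: the four-joint skeleton, the `BONES` edge list, the function names, and the `joint_weight` parameter are all hypothetical, and fusing softmax probabilities is an assumption, since the abstract only states that the two streams' features are weighted and fused. A bone is taken here as the vector from a parent joint to a child joint, which carries the direction information that complements the joints' positions.

```python
import math

# Hypothetical toy skeleton with 4 joints; each bone is (child, parent).
# The paper uses body, hand and face nodes; this edge list is an assumption.
BONES = [(1, 0), (2, 1), (3, 1)]


def joints_to_bones(joints, bones=BONES):
    """Return one bone vector per joint: child position minus parent position.

    The root joint has no parent, so its bone vector stays all zeros.
    """
    dim = len(joints[0])
    bone_vectors = [[0.0] * dim for _ in joints]
    for child, parent in bones:
        bone_vectors[child] = [c - p for c, p in zip(joints[child], joints[parent])]
    return bone_vectors


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(joint_logits, bone_logits, joint_weight=0.5):
    """Weighted sum of the per-class probabilities of the two streams.

    joint_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    joint_probs = softmax(joint_logits)
    bone_probs = softmax(bone_logits)
    return [
        joint_weight * pj + (1.0 - joint_weight) * pb
        for pj, pb in zip(joint_probs, bone_probs)
    ]


def predict(joint_logits, bone_logits, joint_weight=0.5):
    """Index of the class with the highest fused probability."""
    fused = fuse_streams(joint_logits, bone_logits, joint_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

In this sketch each stream would be scored by its own network, and only the final class scores are combined; the weight could be tuned on a validation set.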

Key words

skeleton data/two-stream structure/adaptive spatial temporal graph convolutional module/adaptive multi-scale spatial temporal attention module/feature fusion


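Funding

Natural Science Foundation of Shanghai (22ZR1422200)

Key Fund of the Science and Technology Commission of Shanghai Municipality (19511102803)

Shanghai Industrial Project (XTCX-KJ-2022-68)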

Publication year

2024
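Journal of Applied Sciences (应用科学学报)
Shanghai University; Shanghai Institute of Technical Physics, Chinese Academy of Sciences

Indexed in: CSTPCD, CSCD, Peking University Core Journals (北大核心)
Impact factor: 0.594
ISSN: 0255-8297
Number of references: 17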