Action Recognition Model Based on an Improved Two-Stream Vision Transformer
To address the poor resistance to background interference and low accuracy of existing action recognition methods, an improved two-stream Vision Transformer action recognition model is proposed. The model adopts a segmented sampling strategy to improve its handling of long temporal sequences; a parameter-free attention module embedded at the head of the network strengthens the model's feature representation while suppressing background interference; and a temporal attention module embedded at the tail of the network fully extracts temporal features by integrating high-level semantic information in the time domain. A new joint loss function is proposed to enlarge inter-class differences and reduce intra-class differences, and a decision fusion layer is adopted to fully exploit the features of the optical-flow and RGB streams. Comparative and ablation experiments are conducted on the benchmark datasets UCF101 and HMDB51; the ablation results verify the effectiveness of the proposed components. The comparison results show that the accuracy of the proposed method exceeds that of the temporal segment network (TSN) by 3.48% and 7.76% on the two datasets, respectively, outperforming current mainstream algorithms and demonstrating good recognition performance.
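The parameter-free attention referenced in the keywords is SimAM, which derives a per-neuron weight from an energy function rather than from learned parameters. A minimal NumPy sketch of that weighting, assuming a single (C, H, W) feature map and the published closed-form (the `lam` regularizer value is an illustrative default, not taken from this paper):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map.

    Each neuron's inverse energy is computed from its squared deviation
    from the channel mean; a sigmoid of that energy gates the input.
    """
    C, H, W = x.shape
    n = H * W - 1                                   # neurons per channel, minus the target
    mu = x.mean(axis=(1, 2), keepdims=True)         # per-channel mean
    d = (x - mu) ** 2                               # squared deviation
    v = d.sum(axis=(1, 2), keepdims=True) / n       # per-channel variance estimate
    e_inv = d / (4 * (v + lam)) + 0.5               # closed-form inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))       # sigmoid gating, no learned weights
```

Because the gate is computed entirely from the feature statistics, the module adds attention without adding parameters, which is why it can be inserted at the network head at negligible cost.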
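The abstract does not give the exact form of the joint loss; a common instantiation of "enlarge inter-class differences, reduce intra-class differences" is softmax cross-entropy combined with a center-loss term. The sketch below is that assumed combination, with the weight `alpha` and all argument names chosen for illustration only:

```python
import numpy as np

def joint_loss(features, logits, labels, centers, alpha=0.5):
    """Illustrative joint loss: cross-entropy + center loss.

    features : (N, D) embedding vectors
    logits   : (N, K) classifier outputs
    labels   : (N,)   integer class labels
    centers  : (K, D) running class centers
    """
    # Cross-entropy term: pushes classes apart (inter-class separability).
    z = logits - logits.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    ce = -np.log(p[np.arange(len(labels)), labels]).mean()

    # Center-loss term: pulls samples toward their class center
    # (intra-class compactness).
    intra = ((features - centers[labels]) ** 2).sum(axis=1).mean()

    return ce + alpha * intra
```

In training, the class centers would themselves be updated (e.g. by a moving average of each class's features) so the compactness term stays meaningful as the embedding drifts.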
Keywords: action recognition; Vision Transformer; SimAM parameter-free attention; temporal attention; joint loss