首页|基于改进双流视觉Transformer的行为识别模型

基于改进双流视觉Transformer的行为识别模型

扫码查看
针对现有行为识别方法中抗背景干扰能力差和准确率低等问题,提出了一种改进的双流视觉Transformer行为识别模型.该模型采用分段采样的方法来增加模型对长时序列数据的处理能力;在网络头部嵌入无参数的注意力模块,在降低动作背景干扰的同时,增强了模型的特征表示能力;在网络尾部嵌入时间注意力模块,通过融合时域高语义信息来充分提取时序特征.文中提出了一种新的联合损失函数,旨在增大类间差异并减少类内差异;采用决策融合层以充分利用光流与RGB流特征.针对上述改进模型,在基准数据集UCF101和HMDB51上进行消融及对比实验,消融实验结果验证了所提方法的有效性,对比实验结果表明,所提方法相比时间分段网络在两个数据集上的准确率分别提高了3.48%和7.76%,优于目前的主流算法,具有较好的识别效果.
Action Recognition Model Based on Improved Two Stream Vision Transformer
To address the issues of poor resistance to background interference and low accuracy in existing action recognition methods,an improved dual stream visual Transformer action recognition model is proposed.The model adopts a segmented sam-pling method to increase its processing ability for long-term sequence data;embedding a parameter free attention module in the network header enhances the model's feature representation ability while reducing action background interference;embedding a temporal attention module at the tail of the network to fully extract temporal features by integrating high semantic information in the time domain.A new joint loss function is proposed in the paper,aiming to increase inter class differences and reduce intra class differences.Adopting a decision fusion layer to fully utilize the features of optical flow and RGB flow.In response to the above improved model,comparative and ablation experiments are conducted on the benchmark datasets UCF101 and HMDB51.The ablation experiment results verify the effectiveness of the proposed method.The comparison results show that the accuracy of the proposed method is 3.48%and 7.76%higher than that of the time segmented network on the two datasets,respectively,which is better than the current mainstream algorithms and has good recognition performance.

Action recognitionVision TransformerSim AM parameter-free attentionTemporal attentionJoint loss

雷永升、丁锰、沈尧、李居昊、赵东越、陈福仕

展开 >

中国人民公安大学侦查学院 北京 100038

中国人民公安大学公共安全行为科学实验室 北京 100038

行为识别 视觉Transformer SimAM无参注意力 时间注意力 联合损失

公安学一流学科培优行动及公共安全行为科学实验室建设项目

2023ZB02

2024

计算机科学
重庆西南信息有限公司(原科技部西南信息中心)

计算机科学

CSTPCD北大核心
影响因子:0.944
ISSN:1002-137X
年,卷(期):2024.51(7)
  • 1