Fast T3D Action Recognition Method Based on Video Key Frame Extraction

Ding Jianli (丁建立)¹, Yuan Zirui (袁梓瑞)¹, Wang Huaichao (王怀超)¹

Author information

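1. School of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300; Information Technology Research Base of Civil Aviation Administration of China, Tianjin 300300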

Abstract

Video-level action recognition suffers from large input data volumes and slow recognition speed, mainly because a method must both extract human pose in the spatial dimension and model the association between actions in the temporal dimension. This paper proposes a fast T3D action recognition method based on video key frame extraction. An improved Superpoint network extracts key frames from the video to reduce the amount of data. Building on the T3D network, its key module, the variable temporal convolution layer, is decomposed into spatial and temporal parts, which significantly improves computational efficiency. Experiments on the public datasets UCF-101 and HMDB-51 show accuracy close to that of the original T3D network at twice its recognition speed, making the method better suited to practical application scenarios.
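
To illustrate why spatiotemporal decomposition of a 3D convolution can reduce computation, the following is a minimal sketch that compares weight counts. It assumes a (2+1)D-style factorization (a 1 x k x k spatial convolution followed by a t x 1 x 1 temporal convolution); the abstract does not state the exact factorization used, and the function names and channel sizes here are hypothetical, chosen only for illustration.

```python
def conv3d_params(c_in, c_out, t, k):
    """Weights in a full t x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * t * k * k


def factorized_params(c_in, c_mid, c_out, t, k):
    """Weights in a 1 x k x k spatial conv followed by a t x 1 x 1 temporal conv.

    c_mid is the (assumed) number of channels between the two convolutions.
    """
    spatial = c_in * c_mid * k * k
    temporal = c_mid * c_out * t
    return spatial + temporal


# Hypothetical layer: 64 input and output channels, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3)
split = factorized_params(64, 64, 64, 3, 3)
ratio = full / split

print(f"full 3D conv weights:  {full}")
print(f"factorized weights:    {split}")
print(f"reduction factor:      {ratio:.2f}")
```

The reduction factor for this toy layer is a property of the chosen sizes only. It is not the source of the twofold speedup reported in the abstract, which also reflects the reduced number of input frames.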

Key words

Fast action recognition/Video key frame extraction/T3D network/Superpoint network/Fast recognition

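Funding

National Natural Science Foundation of China (U1833114)

Civil Aviation Safety Capability Project (SA2020280)

Fundamental Research Funds for the Central Universities (3122019120)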

Publication year

2024

Computer Applications and Software (计算机应用与软件)

Shanghai Institute of Computing Technology; Shanghai Development Center of Computer Software Technology

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.615

ISSN: 1000-386X

Number of references: 2
