Data Preprocessing Method for Video Action Recognition Depth Models

Preprocessing operations, chiefly video frame sampling and data augmentation, are important means of improving the performance of deep video action recognition models and have recently received increased attention. To address two shortcomings of existing video preprocessing, namely the insufficient discriminability of sampled frames and the limited variety of data augmentation, this study proposes a data preprocessing method for deep video action recognition models. For frame sampling, a motion-guided fragmented video sampling strategy is designed that jointly considers inter-frame difference features and the short-term temporal features of video clips: key frames are located by salient motion and their neighboring frames are sampled along with them, which effectively improves the spatiotemporal discriminability of the selected frames. Drawing on the random data augmentation used in image classification, the sampled short clips are then augmented with a random augmentation strategy so that the recognition model can learn more complex spatial variations. Evaluation experiments on two public video recognition datasets with two representative network models show that the proposed method improves the accuracy of the baseline models by more than 2.5 percentage points, with a maximum gain of 6.8 percentage points. These results verify the effectiveness of the proposed preprocessing method for video action recognition.
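The two steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the motion measure, the segment layout, or the augmentation operations, so the mean absolute frame difference, the equal-length segments, the four toy operations, and all function and parameter names below (`motion_scores`, `motion_guided_clip_indices`, `random_augment_clip`, `num_segments`, `clip_len`, `num_ops`, `magnitude`) are assumptions chosen for illustration.

```python
import numpy as np

def motion_scores(frames):
    """Per-frame motion score: mean absolute difference to the previous frame.
    Assumption: frame differencing stands in for the paper's motion cue."""
    f = frames.astype(np.float32)
    diffs = np.abs(f[1:] - f[:-1]).mean(axis=(1, 2, 3))
    return np.concatenate([[0.0], diffs])  # frame 0 has no predecessor

def motion_guided_clip_indices(frames, num_segments=4, clip_len=3):
    """Split the video into equal segments, take the highest-motion frame of
    each segment as its key frame, and sample clip_len adjacent frames
    centered on it (clamped to the video boundaries)."""
    T = len(frames)
    assert T >= max(num_segments, clip_len), "video too short"
    scores = motion_scores(frames)
    bounds = np.linspace(0, T, num_segments + 1).astype(int)
    clips = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        key = start + int(np.argmax(scores[start:end]))
        first = min(max(key - clip_len // 2, 0), T - clip_len)
        clips.append(np.arange(first, first + clip_len))
    return np.stack(clips)  # shape (num_segments, clip_len)

# Toy augmentation pool (assumption; not the operation set used in the paper).
# Every op acts on a whole clip of shape (L, H, W, C), so all frames of a clip
# receive the same transform and temporal consistency is preserved.
def _flip(clip, m):
    return clip[:, :, ::-1]

def _brightness(clip, m):
    return np.clip(clip.astype(np.float32) + 64.0 * m, 0, 255).astype(np.uint8)

def _contrast(clip, m):
    f = clip.astype(np.float32)
    mean = f.mean()
    return np.clip((f - mean) * (1.0 + m) + mean, 0, 255).astype(np.uint8)

def _shift(clip, m):
    return np.roll(clip, int(round(0.2 * m * clip.shape[2])), axis=2)

OPS = [_flip, _brightness, _contrast, _shift]

def random_augment_clip(clip, rng, num_ops=2, magnitude=0.5):
    """Apply num_ops randomly chosen operations at a shared magnitude."""
    for i in rng.choice(len(OPS), size=num_ops, replace=False):
        clip = OPS[i](clip, magnitude)
    return clip
```

In use, `motion_guided_clip_indices(video)` returns one row of frame indices per segment, and each `video[row]` clip can be passed to `random_augment_clip(clip, np.random.default_rng())` before being fed to the model. Applying one randomly drawn transform sequence per clip, rather than per frame, is a design choice made here so that augmentation does not introduce artificial motion between neighboring frames.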

video action recognition; preprocessing method; motion-guided fragmented video sampling; data augmentation; deep learning

An Fengmin, Zhang Bingbing, Dong Wei, Zhang Jianxin

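School of Computer Science and Engineering, Dalian Minzu University, Dalian 116650, Liaoning, China

School of Information and Communication Engineering, Dalian University of Technology, Dalian 116024, Liaoning, China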

National Natural Science Foundation of China (61972062); Applied Basic Research Program of Liaoning Province (2023JH2/101300191, 2023JH2/101300193)

2024

Computer Engineering
East China Institute of Computing Technology; Shanghai Computer Society

CSTPCD; Peking University Core Journals
Impact factor: 0.581
ISSN: 1000-3428
Year, volume (issue): 2024, 50(2)