Dynamic Gesture Data Optimization and Recognition Based on Encoded Video (基于编码视频的动态手势数据优化与识别)

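Chinese abstract (translated): Syntax elements in an encoded video stream, such as motion vectors and residuals, can replace optical flow for motion representation, but their inherent pixel noise and feature sparsity reduce recognition accuracy for fine movements. To address this, a high-accuracy, low-complexity dynamic gesture recognition framework is designed on the basis of data optimization of the encoded-video syntax elements. First, a key P-frame selection method is proposed that resolves the feature sparsity problem by selecting encoded frames with higher information content. Second, a joint residual feature representation method is proposed that uses residuals to obtain fine gesture contour maps and removes pixel noise outside the hand from the motion vectors. Finally, a lightweight and efficient dynamic gesture recognition model is designed that uses the optimized motion vectors and residuals to obtain a computational effect similar to that of optical flow. The method was validated on the VIVA, Sheffield Kinect Gesture (SKIG), NvGesture, and EgoGesture datasets. Experimental results show that, using only the RGB modality, it reaches recognition accuracies of 82.94%, 99.72%, 81.12%, and 90.48%, respectively, reduces storage overhead by 89%, and obtains results close to those of advanced methods at 4.7 times the running speed.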

English title: Dynamic Gesture Data Optimization and Recognition Based on Encoded Video

English abstract: Syntax elements such as motion vectors (MVs) and residuals in encoded video data streams can substitute for optical flow in motion representation. However, their inherent pixel noise and feature sparsity reduce accuracy when fine movements are recognized. Hence, a dynamic gesture recognition framework with higher precision and lower complexity is designed on the basis of data optimization of the syntax elements in encoded video. First, a key P-frame selection strategy is introduced to cope with feature sparsity by selecting encoded frames that carry more information. Second, a joint residual feature representation method is proposed that removes noisy MVs not associated with the hand by using finer gesture contour maps obtained from residuals. Finally, a lightweight and efficient dynamic gesture recognition model is designed that leverages the optimized MVs and residuals to achieve a computational effect similar to that of optical flow. The proposed method is validated on the VIVA, Sheffield Kinect Gesture (SKIG), NvGesture, and EgoGesture datasets. The experimental results show that, using only RGB data, the method reaches recognition accuracies of 82.94%, 99.72%, 81.12%, and 90.48%, respectively, reduces storage overhead by 89%, and achieves results comparable to those of advanced methods while running at 4.7 times the speed.
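The first two steps of the abstract (key P-frame selection and residual-guided cleaning of motion vectors) can be illustrated with a minimal sketch. The abstract does not say how information content is measured or how the gesture contour is derived from the residual, so everything below is an assumption made for illustration: the function names are hypothetical, frame information is scored as the summed absolute motion-vector and residual values, and the hand region is approximated by a fixed residual threshold. It is not the authors' implementation.

```python
# Hypothetical sketch only: the scoring rule, the threshold mask, and all
# names are assumptions, not the method described in the paper.

def frame_information(mv, residual):
    """Score one P-frame by summed |MV| components plus summed |residual|."""
    mv_energy = sum(abs(dx) + abs(dy) for row in mv for (dx, dy) in row)
    res_energy = sum(abs(r) for row in residual for r in row)
    return mv_energy + res_energy


def select_key_p_frames(frames, k):
    """Return the indices of the k highest-scoring frames, in temporal order.

    frames is a list of (mv, residual) pairs; mv is a 2-D grid of (dx, dy)
    tuples and residual is a 2-D grid of numbers with the same shape.
    """
    ranked = sorted(
        range(len(frames)),
        key=lambda i: frame_information(*frames[i]),
        reverse=True,
    )
    return sorted(ranked[:k])


def mask_motion_vectors(mv, residual, threshold):
    """Zero every motion vector whose co-located |residual| is below threshold."""
    return [
        [(dx, dy) if abs(r) >= threshold else (0, 0)
         for (dx, dy), r in zip(mv_row, res_row)]
        for mv_row, res_row in zip(mv, residual)
    ]


if __name__ == "__main__":
    static = ([[(0, 0), (0, 0)], [(0, 0), (0, 0)]], [[0, 0], [0, 0]])
    moving = ([[(3, 1), (1, 0)], [(0, 2), (0, 0)]], [[9, 0], [8, 0]])
    slight = ([[(1, 0), (0, 0)], [(0, 0), (0, 0)]], [[2, 0], [0, 0]])
    frames = [static, moving, slight]

    keep = select_key_p_frames(frames, k=2)
    print("key P-frames:", keep)
    print("cleaned MVs:", mask_motion_vectors(*moving, threshold=5))
```

In this toy input the static frame is dropped, and the motion vector `(1, 0)` in the second frame is zeroed because its co-located residual is below the threshold, which mirrors the idea of discarding motion that does not coincide with the hand.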

English keywords:

dynamic gesture recognition; encoded video; motion vector; residual; data optimization

Authors:

Xie Xiaoyan (谢晓燕), Cao Panyu (曹盘宇), Xia Hao (夏浩), Chen Yuxin (陈雨馨)

Affiliation:

School of Computer Science, Xi'an University of Posts and Telecommunications, Xi'an 710121

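Chinese keywords:

动态手势识别 (dynamic gesture recognition); 编码视频 (encoded video); 运动矢量 (motion vector); 残差 (residual); 数据优化 (data optimization)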

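Funding:

Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project; National Natural Science Foundation of China; National Natural Science Foundation of China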

Project numbers:

2022ZD0119001; 61834005; 61772417

Publication year:

2024

DOI:

10.13190/j.jbupt.2023-072

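Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报)

Beijing University of Posts and Telecommunications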

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.592

ISSN: 1007-5321

Year, volume (issue): 2024, 47(2)

Reference count: 1