Dynamic Gesture Data Optimization and Recognition Based on Encoded Video
Syntax elements in encoded video streams, such as motion vectors (MVs) and residuals, can substitute for optical flow in motion representation. However, their inherent pixel noise and feature sparsity can cause errors when fine movements are recognized. Hence, a dynamic gesture recognition framework is designed to achieve higher precision and lower complexity by optimizing the syntax-element data in encoded video. First, a key P-frame selection strategy is introduced to cope with feature sparsity by selecting the encoded frames that carry higher information content. Second, a joint residual feature representation method is proposed to remove noisy MVs not associated with the hand by using finer gesture contour maps obtained from the residuals. Finally, a lightweight and efficient dynamic gesture recognition model is designed, leveraging the optimized MVs and residuals to achieve a computation effect similar to that of optical flow. The proposed method is validated on the Viva, Sheffield Kinect Gesture (SKIG), NvGesture, and EgoGesture datasets. The experimental results show that, using only RGB data, the proposed method achieves recognition accuracies of 82.94%, 99.72%, 81.12%, and 90.48%, respectively, while reducing storage overhead by 89% and achieving results comparable to advanced methods at a running speed 4.7 times faster.
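The first two steps (key P-frame selection and residual-guided MV denoising) can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring rule (mean MV magnitude plus mean absolute residual), the fixed residual threshold, the array layouts, and the function names `select_key_p_frames` and `mask_mvs_with_residual` are all assumptions made for illustration, since the abstract does not specify how information content or the contour map is computed.

```python
import numpy as np

def select_key_p_frames(mvs, residuals, k):
    """Return the indices (in temporal order) of the k highest-scoring P-frames.

    Assumed layout: mvs is (T, H, W, 2), residuals is (T, H, W).
    Assumed score: mean MV magnitude + mean absolute residual per frame,
    used here as a stand-in for "information content".
    """
    mv_energy = np.linalg.norm(mvs, axis=-1).mean(axis=(1, 2))
    res_energy = np.abs(residuals).mean(axis=(1, 2))
    score = mv_energy + res_energy
    top_k = np.argsort(score)[-k:]   # indices of the k largest scores
    return np.sort(top_k)            # restore temporal order

def mask_mvs_with_residual(mv, residual, thresh):
    """Zero out MVs where the residual is weak.

    Assumption: pixels with |residual| > thresh approximate the gesture
    contour, so MVs elsewhere are treated as noise and removed.
    """
    contour = np.abs(residual) > thresh    # (H, W) boolean mask
    return mv * contour[..., None]         # broadcast over the 2 MV components

# Synthetic example: 4 frames of 4x4 blocks; frames 1 and 3 carry motion.
mvs = np.zeros((4, 4, 4, 2))
residuals = np.zeros((4, 4, 4))
mvs[1] += 1.0
mvs[3] += 2.0
residuals[1, 0, 0] = 10.0
residuals[3, 2, 2] = 10.0

keys = select_key_p_frames(mvs, residuals, k=2)
clean = mask_mvs_with_residual(mvs[1], residuals[1], thresh=5.0)
```

In this toy input the two frames with nonzero motion are selected, and only the MV at the single high-residual position survives the mask. A real pipeline would read MVs and residuals from the decoder and would likely use a more robust contour estimate than a fixed threshold.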