FAST T3D ACTION RECOGNITION METHOD BASED ON VIDEO KEY FRAME EXTRACTION
Video-level action recognition methods suffer from large volumes of input video data and slow recognition speed. The main reason is that these methods must not only extract human posture in the spatial dimension but also model the association between actions in the temporal dimension. This paper proposes a fast T3D action recognition method based on video key frame extraction. The method extracts video key frames with an improved SuperPoint network to reduce the amount of video data. Building on the T3D network, it improves computational efficiency by applying a spatiotemporal decomposition to the network's key module, the variable temporal convolution layer. Experimental validation was conducted on the public datasets UCF-101 and HMDB-51. The method achieves accuracy similar to that of the original T3D network while recognizing actions at twice the speed, which makes it better suited to practical application scenarios.
Fast action recognition; Key frame extraction; T3D network; SuperPoint network; Fast recognition
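
As a rough illustration of why a spatiotemporal decomposition can reduce computation, the sketch below compares the weight count of a full 3D convolution with that of a factorized version (a spatial 1 x k x k convolution followed by a temporal t x 1 x 1 convolution). The abstract does not specify the exact form of the decomposition, so this (2+1)D-style factorization, the function names, and the channel and kernel sizes are all assumptions chosen for illustration, not the configuration used in the paper.

```python
def conv3d_params(c_in, c_out, t, k):
    """Weight count of a full t x k x k 3D convolution (bias ignored)."""
    return c_in * c_out * t * k * k


def factorized_params(c_in, c_out, t, k, c_mid=None):
    """Weight count of a 1 x k x k spatial convolution followed by a
    t x 1 x 1 temporal convolution (bias ignored).

    c_mid is the hypothetical number of intermediate channels between
    the two convolutions; it defaults to c_out.
    """
    if c_mid is None:
        c_mid = c_out
    spatial = c_in * c_mid * k * k   # 2D part: acts within each frame
    temporal = c_mid * c_out * t     # 1D part: acts across frames
    return spatial + temporal


# Illustrative sizes only (assumed, not taken from the paper).
full = conv3d_params(c_in=64, c_out=64, t=3, k=3)
fact = factorized_params(c_in=64, c_out=64, t=3, k=3)
print(full, fact, full / fact)
```

Under these assumed sizes the factorized layer has fewer weights than the full 3D layer, and the gap widens as the temporal depth t grows. The count covers only a single layer, so it should not be read as an account of the end-to-end speedup reported above.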