Action Recognition Network Combining Spatio-Temporal Adaptive Graph Convolution and Transformer
In a human-centered smart factory, perceiving and understanding worker behavior is crucial, as different job categories are often associated with specific work times and tasks. In this paper, recognition accuracy is improved by combining two approaches, adaptive graphs and Transformers, so that the model attends more closely to the spatio-temporal information of the skeleton. First, an adaptive graph method is employed to capture connectivity relationships beyond the physical human skeleton. Then, a Transformer framework is used to capture the dynamic temporal variations of the worker's skeleton. To evaluate the model, a dataset of six typical worker actions in intelligent production-line assembly tasks is created and used for validation. The results show that the proposed model achieves Top-1 accuracy comparable to mainstream action recognition models. Finally, the proposed model is compared with several mainstream methods on the publicly available NTU-RGBD and Skeleton-Kinetics datasets, and the experimental results demonstrate its robustness.
Intelligent manufacturing; Recognition of worker activity; Deep learning; Adaptive graph; Transformer
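To make the abstract's architecture concrete, the following is a minimal sketch, not the authors' implementation: a spatial graph convolution whose adjacency combines the fixed skeleton graph with a learned offset and a data-dependent similarity term (so links beyond the physical skeleton can be modeled), followed by a standard Transformer encoder over the temporal axis. All layer sizes, the fusion scheme, and class names such as `AdaptiveGraphConv` and `AGC_Transformer` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveGraphConv(nn.Module):
    """Graph convolution with adjacency = fixed skeleton A + learned offset B
    + data-dependent similarity C (assumed formulation)."""
    def __init__(self, in_channels, out_channels, A, embed_channels=16):
        super().__init__()
        self.register_buffer("A", A)                # (V, V) physical skeleton graph
        self.B = nn.Parameter(torch.zeros_like(A))  # learned global offset
        self.theta = nn.Conv2d(in_channels, embed_channels, 1)
        self.phi = nn.Conv2d(in_channels, embed_channels, 1)
        self.proj = nn.Conv2d(in_channels, out_channels, 1)

    def forward(self, x):                           # x: (N, C, T, V)
        q = self.theta(x).mean(2)                   # (N, E, V), pooled over time
        k = self.phi(x).mean(2)
        # Data-dependent joint-to-joint similarity, normalized per row
        C = torch.softmax(torch.einsum("nev,new->nvw", q, k), dim=-1)
        adj = self.A + self.B + C                   # adaptive adjacency (N, V, V)
        y = torch.einsum("nctv,nvw->nctw", x, adj)  # aggregate neighbor features
        return F.relu(self.proj(y))


class AGC_Transformer(nn.Module):
    """Spatial adaptive graph conv -> temporal Transformer encoder -> classifier."""
    def __init__(self, num_class, num_joints, in_channels=3, hidden=64, heads=4):
        super().__init__()
        A = torch.eye(num_joints)                   # placeholder skeleton graph
        self.gcn = AdaptiveGraphConv(in_channels, hidden, A)
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=heads,
                                           batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, num_layers=2)
        self.fc = nn.Linear(hidden, num_class)

    def forward(self, x):                           # x: (N, C, T, V)
        x = self.gcn(x)                             # (N, H, T, V)
        x = x.mean(-1).transpose(1, 2)              # pool joints -> (N, T, H)
        x = self.temporal(x)                        # temporal self-attention
        return self.fc(x.mean(1))                   # clip-level logits


if __name__ == "__main__":
    model = AGC_Transformer(num_class=6, num_joints=25)
    logits = model(torch.randn(2, 3, 64, 25))       # (batch, xyz, frames, joints)
    print(logits.shape)                             # torch.Size([2, 6])
```

In this sketch the spatial and temporal modules are stacked once for brevity; the paper's actual network may interleave several such blocks and use a different attention configuration.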