Human Action Recognition Method Based on Multi-channel Fusion
Objective: With the continued development of science and technology, advances in artificial intelligence and deep learning increasingly penetrate various fields and significantly improve social productivity. Among these, WiFi-based human action recognition has emerged as a prominent research direction, with application potential in smart homes, health care, military training, and other fields. However, as wireless communication technology diversifies, human action recognition faces new challenges, particularly in the expanding applications of multiple-input multiple-output (MIMO) systems, and further research is needed so that recognition technology can adapt to the diverse communication environments of the future. Because a MIMO system is designed for parallel transmission over multiple channels, its spatial diversity provides higher data rates and improved signal quality. In practice, however, multi-channel parallel transmission often suffers interference from the multipath effect, so the signal arriving at the receiving antenna exhibits complex fluctuations that vary with path length and incident angle. Since this path information is embedded in channel state information (CSI), the CSI characteristics of the same action can differ, while different actions may exhibit remarkably similar CSI characteristics. This leads to incomplete feature extraction and poor generalization in action classification. An effective mechanism for extracting and classifying human actions in complex MIMO environments is therefore critical. Such a mechanism must overcome the multipath effect in multi-channel transmission to extract action features accurately and consistently, and it must handle the differences in CSI features between actions while improving the generalization ability and robustness of the model.

Method: This study examines the physical characteristics of human actions and the transmission characteristics of MIMO systems, and proposes a deep learning-based human action recognition method that combines a dual attention mechanism with a multi-channel, multi-scale fusion temporal convolutional network (TCN). First, a multi-channel information extraction model is constructed to exploit the spatial diversity inherent in MIMO systems. By extracting action-specific features from the signal received at each antenna, this model enhances the representation of a given action across different channels. At the same time, a multi-scale integration mechanism fuses action features at different scales, strengthening the system's ability to represent actions. The extracted action features are then aggregated through the dual attention mechanism. The feature map fusion attention mechanism mines the correlation between the action features of each channel and assigns higher weights to more relevant features, which increases the discriminative power of the extracted action representation. The feature channel attention mechanism, in contrast, assigns weights to the channels themselves according to their importance for action recognition, so that the model prioritizes the features that contribute most to recognition. A temporal convolutional network then processes the extracted action features. It performs convolution operations on features at different time steps, capturing changes over time and long-term dependencies between action features. This allows the model to distinguish actions with similar spatial characteristics but distinct temporal patterns, which is crucial for accurately recognizing complex and continuous actions. Finally, a global average pooling (GAP) layer connects the extracted feature maps to the action classifier. This operation preserves the action characteristics of each channel while enabling a global comparison of them, and balancing the characteristics of each channel further improves recognition accuracy.

Results and Discussion: Comprehensive experiments are conducted on public datasets and in real-world environments to evaluate the proposed model under both controlled and uncontrolled conditions. On the public dataset, the proposed model achieves an accuracy of 98.72% in identifying seven distinct human actions, surpassing the recognition performance of traditional models. To validate the model's adaptability to real-world environments, tests are also conducted in self-built laboratories, classrooms, and corridors. These environments present challenges such as uncontrolled lighting conditions, background noise, and varying distances between the user and the WiFi receiver. Despite these challenges, the proposed model achieves accuracy rates of 97.94%, 97.28%, and 95.66%, respectively, for ten different actions in these environments. These results demonstrate the model's robustness and adaptability to real-world scenarios and make it a promising tool for practical applications. WiFi-based human action recognition has significant potential in domains such as healthcare, smart homes, and human-computer interaction. In healthcare, real-time action recognition can enhance patient care, detect potential falls, and assist elderly individuals. Smart homes can automatically adjust conditions based on occupant activities. Human-computer interaction can become smoother and more natural. The proposed model provides a reliable and intelligent human action recognition solution for such practical applications.

Conclusion: The human action recognition model proposed in this study, which employs a dual attention mechanism and a multi-channel, multi-scale temporal convolutional network, addresses key limitations of human action recognition in wireless environments. By effectively capturing the spatial and temporal characteristics of human actions from CSI data, the model achieves high accuracy in diverse environments. The study has both theoretical value and practical significance for future research on human action recognition. In future work, the proposed model will be further optimized and refined based on feedback from various natural environments to improve its adaptability and generalizability for applications in healthcare, smart homes, and human-computer interaction.
Keywords: action recognition; deep learning; channel state information; TCN; attention
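The Method part of the abstract names three building blocks: temporal convolution over features at different time steps, multi-scale fusion, and channel weighting followed by global average pooling. The following is a minimal pure-Python sketch of those ideas, not the authors' implementation. The function names, the fixed kernel, the dilation rates (1, 2, 4), and the parameter-free softmax weighting are all illustrative assumptions; a real channel attention module would typically learn its weights.

```python
import math


def causal_dilated_conv1d(x, kernel, dilation):
    """Causal 1-D convolution: output at time t uses only x[t], x[t-d], x[t-2d], ...

    Positions before the start of the sequence are treated as zeros.
    """
    k_size = len(kernel)
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - (k_size - 1 - k) * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def multi_scale_features(x, kernel, dilations=(1, 2, 4)):
    """Apply the same kernel at several dilation rates (assumed scales)."""
    return [causal_dilated_conv1d(x, kernel, d) for d in dilations]


def global_average_pool(feature_maps):
    """Reduce each channel (a list of values over time) to its mean."""
    return [sum(ch) / len(ch) for ch in feature_maps]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def channel_attention(feature_maps):
    """Simplified, parameter-free stand-in for channel attention.

    Each channel is scaled by the softmax of its pooled descriptor.
    Returns the reweighted channels and the weights themselves.
    """
    weights = softmax(global_average_pool(feature_maps))
    scaled = [[w * v for v in ch] for w, ch in zip(weights, feature_maps)]
    return scaled, weights


# Hypothetical CSI amplitude stream from one antenna.
csi_stream = [1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
scales = multi_scale_features(csi_stream, kernel=[0.5, 0.5])
attended, weights = channel_attention(scales)
descriptor = global_average_pool(attended)  # one value per scale, fed to a classifier
```

Larger dilation rates let a kernel of the same size cover a longer time span, which is one common way to capture long-term dependencies without deepening the network.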