Adaptive Pitfall: Exploring the Effectiveness of Adaptation in Skeleton-Based Action Recognition

Graph convolutional networks (GCNs) have achieved remarkable performance in skeleton-based action recognition by exploiting the adjacency topology of the body representation. However, the adaptive strategy that previous methods adopt to construct the adjacency matrix does not balance performance against computational cost. We describe this imbalance as the Adaptive Trap and propose that the adaptive module can be replaced by multiple autonomous submodules, which simultaneously enhances the dynamic joint representation and reduces the network resources required. To replace the adaptive model, we present two distinct strategies that yield comparable effects. (1) Optimization. Individuality and Commonality GCNs (IC-GCNs) are proposed to optimize how the associativity adjacency matrix is constructed for adaptive processing. The uniqueness and co-occurrence between different joints and frames in the skeleton topology are captured through methods such as preferential fusion of physical information, extreme compression of multi-dimensional channels, and simplification of the self-attention mechanism. (2) Replacement. Auto-Learning GCNs (AL-GCNs) are proposed to remove the popular adaptive modules and use human key points as motion compensation to provide dynamic correlation support. AL-GCNs construct a fully learnable group adjacency matrix in both the spatial and temporal dimensions, resulting in an elegant and efficient GCN-based model. In addition, three effective tricks for skeleton-based action recognition (Skip-Block, Bayesian Weight Selection Algorithm, and Simplified Dimensional Attention) are presented and analyzed in this paper. Finally, we employ the variable channel and grouping method to explore the hardware resource bound of the two proposed models. IC-GCN and AL-GCN exhibit impressive performance across the NTU-RGB+D 60, NTU-RGB+D 120, NW-UCLA, and UAV-Human datasets, with an exceptional parameter-cost ratio.
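
To make the idea of a group adjacency matrix in the spatial dimension more concrete, the following is a minimal NumPy sketch of a grouped graph convolution over skeleton joints. It is an illustration under stated assumptions, not the AL-GCN implementation: the function name `grouped_graph_conv`, the tensor layout (channels, frames, joints), the identity initialization, and all sizes are hypothetical choices made for this example. In a trained model the `adjacency` array would be a learnable parameter updated by gradient descent; here it is only a fixed array.

```python
import numpy as np


def grouped_graph_conv(x, adjacency, weight):
    """Spatial graph convolution with one adjacency matrix per channel group.

    x:         (C_in, T, V) features: channels, frames, joints (assumed layout)
    adjacency: (G, V, V)    one joint-to-joint matrix per channel group
    weight:    (C_out, C_in) pointwise channel-mixing weights
    """
    c_in, t, v = x.shape
    groups = adjacency.shape[0]
    if c_in % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    # Split the channels into G groups: (G, C_in // G, T, V).
    xg = x.reshape(groups, c_in // groups, t, v)
    # Each group aggregates joint features with its own adjacency matrix.
    agg = np.einsum("gctu,guv->gctv", xg, adjacency).reshape(c_in, t, v)
    # Mix channels pointwise, like a 1x1 convolution.
    return np.einsum("oc,ctv->otv", weight, agg)


# Hypothetical sizes: 25 joints, 8 frames, 16 -> 32 channels, 4 groups.
rng = np.random.default_rng(0)
num_joints, num_frames, c_in, c_out, groups = 25, 8, 16, 32, 4
x = rng.standard_normal((c_in, num_frames, num_joints))
adjacency = np.stack([np.eye(num_joints)] * groups)  # identity start (assumption)
weight = rng.standard_normal((c_out, c_in)) * 0.1
y = grouped_graph_conv(x, adjacency, weight)
```

Because the adjacency here is stored directly rather than computed from the input by an attention-style module, its cost does not depend on the input sample, which is one way to read the trade-off between performance and computational cost discussed in the abstract.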

Skeleton; Adaptation models; Convolution; Adaptive systems; Computational efficiency; Correlation; Topology; Optimization; Accuracy; Computational modeling

Qiguang Miao, Wentian Xin, Ruyi Liu, Yi Liu, Mengyao Wu, Cheng Shi, Chi-Man Pun

School of Computer Science and Technology, Xidian University, Xi'an, China

School of Computer Science and Engineering, Xi'an University of Technology, Xi'an, China

Department of Computer and Information Science, Faculty of Science and Technology, University of Macau, Avenida da Universidade, Macau, China

2025

IEEE Transactions on Multimedia

Year, Volume (Issue): 2025, 27(1)
  • 64