Multi-consistency Constrained Semi-supervised Video Action Detection Based on Feature Enhancement and Residual Reshaping
In consistency-regularized semi-supervised video action detection, the feature representations of the original data and the augmented data tend to induce a discriminative domain bias between the two types of data, which leads to inadequate fitting of the discriminative results. To address this issue, this paper proposes a multi-consistency constrained semi-supervised video action detection method based on feature enhancement and residual reshaping. First, the basic action feature descriptors are continuously enhanced and encoded in the spatiotemporal dimension to obtain contextual information that is crucial for video action understanding. Next, a residual feature reshaping module reshapes the features while extracting multi-scale residual information. To reduce the discriminative bias between the two types of data, multiple consistency constraints are applied to the original data and the augmented data from the perspectives of classification features and action localization features, so that the discriminative results and feature representations of the augmented data match those of the original data. Experimental results on the JHMDB-21 and UCF101-24 datasets show that the proposed method improves video action detection accuracy when labeled samples are limited and that it is highly competitive.
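The idea of constraining the original and augmented views from both the classification and the localization perspective can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the flat-list inputs, the choice of mean squared error, and the weights `w_cls` and `w_loc` are all assumptions made for illustration, since the abstract does not specify the form of the constraints.

```python
# Illustrative sketch only. The loss form (MSE), the weights, and all names
# below are assumptions; the abstract does not define the actual constraints.

def mse(a, b):
    """Mean squared error between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_consistency_loss(cls_orig, cls_aug, loc_orig, loc_aug,
                           w_cls=1.0, w_loc=1.0):
    """Weighted sum of two consistency terms between the two views.

    cls_orig / cls_aug: classification outputs (e.g. class probabilities)
        for the original and the augmented clip.
    loc_orig / loc_aug: flattened action localization maps for the same
        two clips.
    """
    cls_term = mse(cls_orig, cls_aug)   # classification consistency
    loc_term = mse(loc_orig, loc_aug)   # localization consistency
    return w_cls * cls_term + w_loc * loc_term


# Toy usage with made-up values for one unlabeled clip.
cls_orig = [0.7, 0.2, 0.1]
cls_aug = [0.6, 0.3, 0.1]
loc_orig = [0.0, 1.0, 1.0, 0.0]
loc_aug = [0.0, 1.0, 0.5, 0.0]
loss = multi_consistency_loss(cls_orig, cls_aug, loc_orig, loc_aug)
```

In this sketch the loss is zero when the two views produce identical outputs and grows as either the class predictions or the localization maps diverge, which is the behavior a consistency constraint on unlabeled data is meant to penalize.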