Lightweight action detection method based on collective convolution
Spatio-temporal behavior detection is an important research direction in computer vision. To reduce model size and improve detection speed, we propose a lightweight action detection method based on collective convolution (CoConv). The temporal information of a video is converted into a spatio-temporal image (STI), and collective convolution extracts spatio-temporal features of the same position at different times. Building on YOLOv5, we replace the backbone network and detection head with collective convolution modules to construct a spatio-temporal action detection network. The detection results on spatio-temporal images are then linked through post-processing to quickly form video-level results and improve action detection performance. Experimental results demonstrate that our method reduces network computational complexity, increases detection speed, and outperforms existing behavior detection methods, while maintaining accuracy and without increasing the number of parameters.
deep learning; spatiotemporal action detection; lightweight; collective convolution; spatio-temporal image
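
The abstract states that detections on individual spatio-temporal images are linked by post-processing into video-level results, but it does not specify the linking rule. As a minimal sketch only, the Python below assumes a greedy IoU-based association between consecutive time steps, which is a common choice for building action tubes and is not necessarily the procedure used in this work. The names `iou`, `link_detections`, and `iou_thresh` are hypothetical and introduced purely for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_detections(
    per_step: List[List[Box]], iou_thresh: float = 0.5
) -> List[List[Tuple[int, Box]]]:
    """Greedily link boxes from consecutive time steps into tubes.

    per_step[t] holds the boxes detected at time step t. Each returned
    tube is a list of (time step, box) pairs. The threshold value is an
    assumption, not a setting taken from the paper.
    """
    tubes: List[List[Tuple[int, Box]]] = []
    for t, boxes in enumerate(per_step):
        unmatched = list(boxes)
        for tube in tubes:
            last_t, last_box = tube[-1]
            # Only extend tubes that were alive at the previous step.
            if last_t != t - 1 or not unmatched:
                continue
            best = max(unmatched, key=lambda b: iou(last_box, b))
            if iou(last_box, best) >= iou_thresh:
                tube.append((t, best))
                unmatched.remove(best)
        # Every box left unmatched starts a new tube.
        tubes.extend([[(t, b)] for b in unmatched])
    return tubes
```

Calling `link_detections` on a list of per-step box lists returns one tube per tracked action instance; a box with no sufficiently overlapping predecessor simply opens a new tube.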