Improved target tracking algorithm based on Swin-Transformer
An improved target tracking algorithm based on the Swin-Transformer network is proposed to address the insufficient feature extraction capability and poor tracking performance often encountered when convolutional neural networks are used in deep learning-based target tracking methods. Firstly, the window attention mechanism of the Swin-Transformer is extended across multiple scales, and a multi-scale window module termed MW-MSA is devised to extract more comprehensive local detail information. Combined with global contextual information, this yields multi-scale discriminative features. Then, these features are integrated with the Transformer's encoder-decoder structure, which serves as the feature fusion network. An optimized multi-layer perceptron is employed as the update score judgment network to establish the state awareness module. Finally, a multi-tracker fusion method is proposed to address challenges such as target occlusion and disappearance by integrating the improved tracking algorithm with the SuperDiMP tracking algorithm. Tests on the L-LaSOT and GOT-10K datasets show improvements over the STARK tracking algorithm: a 2.7% increase in average overlap (AO) and a 3.3% increase in success rate (SR) on GOT-10K, and a 0.8% increase in success rate (AUC) on L-LaSOT. Moreover, under the target disappearance challenge, the success rate is improved by 1%.
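For readers unfamiliar with the reported metrics, the following is a minimal sketch of how average overlap (AO) and success rate (SR) are commonly computed from per-frame bounding boxes. It is an illustration under stated assumptions, not the evaluation code used in this work: the function names, the (x, y, w, h) box layout, and the 0.5 overlap threshold are assumptions, and benchmark toolkits typically average per sequence before averaging over the dataset, which this single-sequence sketch omits.

```python
from typing import Sequence

Box = Sequence[float]  # assumed layout: (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(pred: Sequence[Box], gt: Sequence[Box]) -> float:
    """AO: mean IoU between predicted and ground-truth boxes over all frames."""
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    return sum(overlaps) / len(overlaps)


def success_rate(pred: Sequence[Box], gt: Sequence[Box],
                 threshold: float = 0.5) -> float:
    """SR: fraction of frames whose IoU exceeds the threshold (0.5 assumed here)."""
    overlaps = [iou(p, g) for p, g in zip(pred, gt)]
    return sum(o > threshold for o in overlaps) / len(overlaps)


# Toy two-frame sequence: a perfect match, then a box shifted by half its width.
pred_boxes = [(0, 0, 10, 10), (0, 0, 10, 10)]
gt_boxes = [(0, 0, 10, 10), (5, 0, 10, 10)]
ao = average_overlap(pred_boxes, gt_boxes)
sr = success_rate(pred_boxes, gt_boxes)
```

In this toy sequence the second frame has an IoU of 1/3, so it lowers AO but does not count as a success at the assumed threshold. The AUC-based success rate reported for L-LaSOT is instead obtained by sweeping the threshold from 0 to 1 and integrating the resulting success curve.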