Vehicle re-identification based on Vision Transformer and convolution injection
To address the low robustness of feature extraction in vehicle re-identification, this paper proposes a vehicle re-identification method based on Vision Transformer. First, an object-guiding projection module, used together with an auxiliary embedding module, applies the attention mechanism to suppress the noise caused by different viewpoints, cameras, and redundant backgrounds. Second, building on Vision Transformer's long-distance modeling capability, a channel-aware module connects patch-wise and channel-wise branches in parallel, so that discriminative features across patches and across channels are captured simultaneously. Finally, to exploit the local inductive bias of convolutional neural networks (CNNs), the global feature vector is fed into a convolution injection module that extracts local features; these are jointly optimized with the global features to construct robust vehicle features. To verify the effectiveness of the proposed method, experiments were carried out on the VeRi-776, VehicleID, and VeRi-Wild datasets. The experimental results show that the proposed method performs well on all three datasets.
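The data flow through the channel-aware and convolution injection modules can be illustrated with a minimal sketch. This is not the paper's implementation: the parameter-free attention, the summation of the two parallel branches, the mean pooling used to form the global feature, the fixed 1-D kernel, and the final concatenation are all assumptions made for illustration, as are the function names `channel_aware` and `convolution_injection`.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x):
    # Scaled dot-product self-attention over the rows of x.
    # Assumption: no learned query/key/value projections, to keep the sketch short.
    scale = math.sqrt(len(x[0]))
    scores = matmul(x, transpose(x))
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, x)

def channel_aware(tokens):
    # Two parallel branches over an (N patches x C channels) token matrix:
    # one attends across patches, the other across channels.
    # Assumption: the branches are fused by element-wise addition.
    patch_branch = self_attention(tokens)
    channel_branch = transpose(self_attention(transpose(tokens)))
    return [[p + c for p, c in zip(pr, cr)]
            for pr, cr in zip(patch_branch, channel_branch)]

def conv1d_same(vec, kernel):
    # Zero-padded 1-D cross-correlation; an odd-length kernel is assumed
    # so that the window is centered and the output length equals len(vec).
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(vec) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(vec))]

def convolution_injection(global_feat, kernel):
    # Extract local features from the global vector with a convolution.
    # Assumption: global and local features are concatenated at the output.
    local_feat = conv1d_same(global_feat, kernel)
    return list(global_feat) + local_feat

# Toy input: 3 patches with 4 channels each (hypothetical values).
tokens = [[1.0, 0.0, 2.0, 1.0],
          [0.0, 1.0, 1.0, 0.0],
          [2.0, 1.0, 0.0, 1.0]]
refined = channel_aware(tokens)
# Assumption: the global feature is the mean over patches.
global_feat = [sum(col) / len(col) for col in zip(*refined)]
vehicle_feat = convolution_injection(global_feat, [0.25, 0.5, 0.25])
```

In a trained model the attention projections and convolution kernel would be learned, and the global and local features would each receive their own loss terms during joint optimization; the sketch only shows how the tensors are shaped and combined.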