Multi-modal Feature Fusion for Visible-Infrared Person Re-Identification
With the rapid development and deployment of intelligent surveillance devices, the massive volume of monitoring data generated is difficult to process by human effort alone. In addition, with the wide adoption of RGB-IR dual-mode cameras in recent years, infrared surveillance data has been used to assist related computer vision tasks. To retrieve the same pedestrian more accurately across visible-infrared dual-modal surveillance videos, a visible-infrared person re-identification algorithm based on feature fusion is proposed. The algorithm first designs a Transformer-based feature extractor to generate discriminative features from images of the two modalities. Then, to exploit the complementary information between the two modalities, a bidirectional multi-modal attention method is proposed that aligns the features of different modalities while fusing their complementary semantic information. Experiments on public datasets show that the proposed algorithm achieves better generalization ability and robustness than most existing algorithms, with a prediction accuracy of 99.86% on the SYSU-MM01 dataset and 94.13% on the LLCM dataset.
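Since the abstract describes the bidirectional multi-modal attention only at a high level, the following is a minimal PyTorch sketch of one plausible realization: each modality's tokens cross-attend to the other modality's tokens, and the two aligned streams are fused. The class name, fusion strategy, and all hyperparameters here are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of bidirectional multi-modal attention (assumed design,
# not the authors' code). Names such as BidirectionalModalAttention,
# vis_feat, and ir_feat are hypothetical.
import torch
import torch.nn as nn


class BidirectionalModalAttention(nn.Module):
    """Aligns visible and infrared token features with two cross-attention
    passes (vis->ir and ir->vis), then fuses the attended streams."""

    def __init__(self, dim: int = 768, num_heads: int = 8):
        super().__init__()
        # Cross-attention in both directions; each modality queries the other.
        self.vis_to_ir = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.ir_to_vis = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_vis = nn.LayerNorm(dim)
        self.norm_ir = nn.LayerNorm(dim)
        # Simple concatenate-and-project fusion of the two aligned streams.
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, vis_feat: torch.Tensor, ir_feat: torch.Tensor) -> torch.Tensor:
        # vis_feat, ir_feat: (batch, tokens, dim) from the Transformer extractor.
        # Visible tokens query infrared tokens to borrow IR-specific cues.
        vis_attended, _ = self.vis_to_ir(query=vis_feat, key=ir_feat, value=ir_feat)
        # Infrared tokens query visible tokens for complementary visible cues.
        ir_attended, _ = self.ir_to_vis(query=ir_feat, key=vis_feat, value=vis_feat)
        vis_out = self.norm_vis(vis_feat + vis_attended)  # residual + norm
        ir_out = self.norm_ir(ir_feat + ir_attended)
        # Concatenate and project to obtain a fused cross-modal representation.
        return self.fuse(torch.cat([vis_out, ir_out], dim=-1))


if __name__ == "__main__":
    block = BidirectionalModalAttention(dim=768, num_heads=8)
    vis = torch.randn(4, 197, 768)  # e.g. ViT patch tokens, visible image
    ir = torch.randn(4, 197, 768)   # tokens for the paired infrared image
    print(block(vis, ir).shape)     # torch.Size([4, 197, 768])
```

The residual connections keep each modality's original features intact while the attention passes inject complementary semantics from the other modality, which matches the alignment-plus-fusion role the abstract attributes to the module.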
visible-infrared person ReID; deep learning; cross-modal learning; computer vision