To address the low recognition accuracy of person re-identification (Re-ID) in clothes-changing scenarios, a clothes-changing person re-identification method based on text-image mutual learning (TIML) is proposed, which leverages the ability of contrastive language-image pre-training (CLIP) to generate pseudo-texts. In the first training phase, a pseudo-text generator enhances text diversity by swapping pixel information among samples within the same batch, and a semantic alignment loss LSA is introduced to keep text feature representations consistent. In the second training phase, a global-local fusion network strengthens the discriminative power of visual features by fusing local and global features under the guidance of the textual information obtained in the first phase. Experiments on the PRCC, Celeb-ReID, Celeb-Light, and VC-Clothes datasets demonstrate that the proposed model significantly improves recognition accuracy, particularly when training samples are scarce.
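The batch-level pixel swapping performed by the pseudo-text generator can be sketched roughly as follows. This is a minimal illustration only, assuming a CutMix-style rectangular region swap between randomly paired samples; the function name, region ratio, and all details are hypothetical and not the paper's actual implementation:

```python
import torch

def batch_pixel_swap(images: torch.Tensor, ratio: float = 0.3) -> torch.Tensor:
    """Swap a random rectangular pixel region between each sample and a
    randomly chosen partner from the same batch.

    images: tensor of shape (B, C, H, W).
    ratio:  side-length fraction of the swapped region (assumed hyperparameter).
    """
    b, _, h, w = images.shape
    perm = torch.randperm(b)                      # random partner for each sample
    rh, rw = max(1, int(h * ratio)), max(1, int(w * ratio))
    top = torch.randint(0, h - rh + 1, (1,)).item()
    left = torch.randint(0, w - rw + 1, (1,)).item()
    out = images.clone()
    # paste the partner's region over the same location in each sample
    out[:, :, top:top + rh, left:left + rw] = \
        images[perm, :, top:top + rh, left:left + rw]
    return out
```

Augmented images produced this way would then be encoded by CLIP to yield more diverse pseudo-text features for the first training phase.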
Key words
Clothes-Changing Person Re-identification / Contrastive Language-Image Pre-training (CLIP) / Modal Interaction / Semantic Alignment / Prompt Engineering