To address the low recognition accuracy of person re-identification (Re-ID) in clothes-changing scenarios, a clothes-changing person re-identification method based on text-image mutual learning (TIML) is proposed, which leverages the ability of contrastive language-image pre-training (CLIP) to generate pseudo-texts. In the first training phase, a pseudo-text generator enhances text diversity by swapping pixel information among samples within the same batch, and a semantic alignment loss LSA is introduced to keep text feature representations consistent. In the second training phase, a global-local fusion network strengthens the discriminative power of visual features by fusing local and global features under the guidance of the textual information obtained in the first phase. Experiments on the PRCC, Celeb-ReID, Celeb-Light, and VC-Clothes datasets demonstrate that the proposed model significantly improves recognition accuracy, particularly when training samples are scarce.
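The batch-level pixel swapping performed by the pseudo-text generator can be sketched roughly as follows. This is a minimal illustration only, assuming a CutMix-style rectangular region swap between randomly paired samples; the function name, region ratio, and all details are hypothetical and not the paper's actual implementation:

```python
import torch

def batch_pixel_swap(images: torch.Tensor, ratio: float = 0.3) -> torch.Tensor:
    """Swap a random rectangular pixel region between each sample and a
    randomly chosen partner from the same batch.

    images: tensor of shape (B, C, H, W).
    ratio:  side-length fraction of the swapped region (assumed hyperparameter).
    """
    b, _, h, w = images.shape
    perm = torch.randperm(b)                      # random partner for each sample
    rh, rw = max(1, int(h * ratio)), max(1, int(w * ratio))
    top = torch.randint(0, h - rh + 1, (1,)).item()
    left = torch.randint(0, w - rw + 1, (1,)).item()
    out = images.clone()
    # paste the partner's region over the same location in each sample
    out[:, :, top:top + rh, left:left + rw] = \
        images[perm, :, top:top + rh, left:left + rw]
    return out
```

Augmented images produced this way would then be encoded by CLIP to yield more diverse pseudo-text features for the first training phase.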
Key words
Clothes-Changing Person Re-identification / Contrastive Language-Image Pre-training (CLIP) / Modal Interaction / Semantic Alignment / Prompt Engineering