Text-to-Image Person Reidentification Based on Attribute Dependency Augmentation
Xia Wei¹, Yuan Xinpan¹
Abstract
Text-to-Image Person Reidentification (TIPR) aims to retrieve a target person from a pedestrian gallery given a text description. Its main challenge is to learn features that are robust to free-view images (posture, lighting, and camera viewpoint) and free-form texts. However, because pedestrian attributes are insufficiently mined from text descriptions and pedestrian images, fine-grained differences in detail degrade retrieval performance from text descriptions to pedestrian images. Therefore, this study proposes TIPR based on Attribute Dependency Augmentation (ADA). First, it parses dependency relations from text descriptions and transforms them into dependency matrices. Then, it designs a self-attention-based attribute intervention module that fuses text features with the dependency matrices, yielding attribute-augmented text features that attend more closely to attribute information after the intervention. Finally, the text features and image features participate jointly in training, making the whole network more sensitive to attribute mining. Experiments on the CUHK-PEDES and ICFG-PEDES datasets demonstrate the effectiveness of the proposed model.
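The abstract outlines two concrete steps: parsing each caption into a dependency matrix and fusing that matrix with the text features through self-attention. The paper's exact formulation is not given here, so the following PyTorch sketch is only illustrative; it assumes a spaCy dependency parse (with the `en_core_web_sm` model installed) for the arcs and an additive attention bias for the fusion, and `AttributeIntervention`, `alpha`, and all other names are hypothetical.

```python
# Illustrative sketch of (1) caption -> dependency matrix and
# (2) a self-attention "attribute intervention" biased by that matrix.
# Names and the additive-bias fusion are assumptions, not the paper's code.
import torch
import torch.nn as nn
import spacy


def dependency_matrix(caption: str, nlp) -> torch.Tensor:
    """Build a symmetric 0/1 matrix marking head-dependent token pairs."""
    doc = nlp(caption)
    n = len(doc)
    dep = torch.zeros(n, n)
    for tok in doc:
        dep[tok.i, tok.head.i] = 1.0  # child -> head arc
        dep[tok.head.i, tok.i] = 1.0  # make the relation symmetric
        dep[tok.i, tok.i] = 1.0       # self-loop keeps each token visible
    return dep


class AttributeIntervention(nn.Module):
    """Self-attention whose logits are biased by the dependency matrix,
    steering the text features toward attribute-bearing token pairs."""

    def __init__(self, dim: int, alpha: float = 1.0):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.alpha = alpha  # strength of the dependency bias (assumed hyper-parameter)

    def forward(self, text_feat: torch.Tensor, dep: torch.Tensor) -> torch.Tensor:
        # text_feat: (n_tokens, dim); dep: (n_tokens, n_tokens)
        q, k, v = self.q(text_feat), self.k(text_feat), self.v(text_feat)
        logits = q @ k.t() / (text_feat.size(-1) ** 0.5)
        logits = logits + self.alpha * dep       # inject syntactic structure
        attn = logits.softmax(dim=-1)
        return text_feat + attn @ v              # residual, attribute-augmented features


if __name__ == "__main__":
    nlp = spacy.load("en_core_web_sm")
    caption = "A woman wearing a red coat and black shoes carries a handbag."
    dep = dependency_matrix(caption, nlp)
    feats = torch.randn(dep.size(0), 256)        # stand-in for text-encoder outputs
    out = AttributeIntervention(256)(feats, dep)
    print(out.shape)                             # torch.Size([n_tokens, 256])
```

In the full model, these attribute-augmented text features would then be matched against image features under the retrieval objective, which is what makes the network sensitive to attribute cues during training.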
Keywords
Text-to-Image Person Reidentification / Self-attention mechanism / Syntactic dependency / Free view
Funding
Natural Science Foundation of Hunan Province (2022JJ30231)
Scientific Research Project of Hunan Provincial Department of Education (22B0559)
Publication Year
2024