Journal of Electronics & Information Technology, 2024, Vol. 46, Issue 2: 518-526. DOI: 10.11999/JEIT230614

A Cross-modal Person Re-identification Method Based on Hybrid Channel Augmentation with Structured Dual Attention

Zhuang Jianjun 1, Zhuang Yuchen 2

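Author information

  • 1. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044; Nanjing University of Information Science and Technology-Zhongda Hospital Institute of Intelligent Medicine, Nanjing 210044
  • 2. School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing 210044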

Abstract

Most existing cross-modal person re-identification methods reduce the cross-modal discrepancy by relying on locally shared features of either single-modality original visible-light images or adversarially generated images. As a result, low-level feature information is lost, and recognition accuracy on infrared images is unstable. To address this problem, a feature-fusion cross-modal person re-identification method based on swappable hybrid random channel augmentation with structured dual attention is proposed. Channel-augmented visible images serve as a third modality: the Image Channel Swappable random mix Augmentation (I-CSA) module performs random hybrid single-channel and three-channel augmentation extraction on visible images, which highlights the structural details of pedestrian posture and reduces the inter-modality discrepancy during learning. The Structured joint Attention Feature Fusion (SAFF) module, while focusing on the structural relationships of pedestrian posture across modalities, provides richer supervision for cross-modal representation learning and enhances the robustness of shared features under modality changes. On the SYSU-MM01 dataset, under the single-shot setting of the all-search mode, Rank-1 and mAP reach 71.2% and 68.1%, respectively, outperforming comparable state-of-the-art methods.
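The abstract gives no implementation details for I-CSA, so the following is only a minimal sketch of the general idea of randomly mixing single-channel and three-channel augmentation of a visible image. The function name `channel_augment`, the two modes, and the equal-probability choice between them are all assumptions made for illustration and are not taken from the paper.

```python
import random


def channel_augment(image, rng=None):
    """Randomly augment the channels of an RGB image (illustrative only).

    `image` is an H x W nested list of (r, g, b) tuples.
    Returns the augmented image and the name of the mode that was applied.
    The two modes below are hypothetical stand-ins, not the authors' I-CSA.
    """
    rng = rng or random.Random()
    mode = rng.choice(["single", "swap"])

    if mode == "single":
        # Single-channel mode: pick one channel and replicate it three times,
        # which discards colour and keeps only structure.
        c = rng.randrange(3)
        out = [[(px[c], px[c], px[c]) for px in row] for row in image]
        return out, mode

    # Three-channel mode: apply one random permutation of R, G, B
    # to every pixel, so the channels are exchanged but none is lost.
    order = [0, 1, 2]
    rng.shuffle(order)
    out = [[tuple(px[i] for i in order) for px in row] for row in image]
    return out, mode


if __name__ == "__main__":
    sample = [[(10, 20, 30), (40, 50, 60)]]
    augmented, applied = channel_augment(sample, random.Random(0))
    print(applied, augmented)
```

In a training pipeline, an image produced this way would be fed alongside the original visible and infrared images as an additional input; how the paper weights or schedules the modes is not stated in the abstract.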

Key words

Person Re-identification/Cross-modal/Hybrid channel enhancement/Joint attention/Feature fusion

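Funding

National Key Research and Development Program of China (2021YFE0105500)

National Natural Science Foundation of China (62171228)

"Qing Lan Project" of Jiangsu Higher Education Institutions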

Publication year

2024
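Journal of Electronics & Information Technology
Institute of Electronics, Chinese Academy of Sciences; Department of Information Sciences, National Natural Science Foundation of China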

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 1.302
ISSN: 1009-5896
Citations: 1
References: 27