Journal of Computer-Aided Design & Computer Graphics, 2024, Vol. 36, Issue 7: 1065-1076. DOI: 10.3724/SP.J.1089.2024.19886

Infrared-Visible Person Re-Identification via Multi-Modality Feature Fusion and Self-Distillation

Wan Lei¹, Li Huafeng¹, Zhang Yafei¹

Author information

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500

Abstract

Most existing cross-modality person re-identification methods mine modality-invariant features while ignoring the discriminative features specific to each modality. To make full use of these modality-specific features, an infrared-visible person re-identification method based on multi-modality feature fusion and self-distillation is proposed. First, an attention fusion mechanism based on a dual classifier is proposed; it assigns larger fusion weights to the modality-specific features of each modality and smaller weights to the features shared across modalities, yielding multi-modality fusion features that retain the discriminative modality-specific features of each modality. Second, to make the network features robust to changes in pedestrian appearance, a memory storage module is constructed to store multi-view features of pedestrians. Third, a parameter-free dynamic guidance strategy for self-distillation is designed; guided by the multi-modality fusion features and the multi-view features, it dynamically strengthens the multi-modality and multi-view reasoning capabilities of the network. As a result, the network can infer, from a single-modality image of a pedestrian, the features of that pedestrian in the other modality under different views, which improves cross-modality person re-identification performance. Implemented in the PyTorch deep learning framework, the method is compared with current mainstream methods on the public datasets SYSU-MM01 and RegDB. It achieves Rank-1 accuracies of 63.12% and 92.55% and mAP scores of 61.51% and 89.55% on SYSU-MM01 and RegDB, respectively, outperforming the compared methods.
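For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how Rank-1 accuracy and mAP are commonly computed from a query-to-gallery distance matrix. It is not the authors' evaluation code: the function name `rank1_and_map` and the toy data are hypothetical, and dataset-specific protocol details (for example camera filtering or repeated gallery sampling) are deliberately left out.

```python
from typing import List, Sequence, Tuple


def rank1_and_map(
    dist: Sequence[Sequence[float]],
    query_ids: Sequence[int],
    gallery_ids: Sequence[int],
) -> Tuple[float, float]:
    """Return (Rank-1, mAP) as fractions in [0, 1].

    dist[i][j] is the distance between query i and gallery item j
    (smaller means more similar). Hypothetical helper for illustration.
    """
    rank1_hits = 0
    average_precisions: List[float] = []

    for q, row in enumerate(dist):
        # Rank gallery items from most to least similar to this query.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == query_ids[q] for j in order]

        if not any(matches):
            # A query with no correct gallery item cannot be scored.
            continue

        # Rank-1: is the top-ranked gallery item the same identity?
        rank1_hits += int(matches[0])

        # Average precision: mean of precision@k over every rank k
        # at which a correct match appears.
        hits = 0
        precisions: List[float] = []
        for k, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / k)
        average_precisions.append(sum(precisions) / len(precisions))

    n_valid = len(average_precisions)
    if n_valid == 0:
        raise ValueError("no query has a matching gallery identity")
    return rank1_hits / n_valid, sum(average_precisions) / n_valid


# Toy example: 2 queries (e.g. infrared) against 3 gallery images (e.g. visible).
toy_dist = [
    [0.1, 0.5, 0.9],  # query 0, identity 1
    [0.2, 0.8, 0.3],  # query 1, identity 2
]
rank1, mean_ap = rank1_and_map(toy_dist, query_ids=[1, 2], gallery_ids=[1, 2, 1])
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")  # Rank-1 = 0.500, mAP = 0.583
```

In the toy example, query 0 retrieves its correct matches at ranks 1 and 3 (AP = 5/6), while query 1 finds its only match at rank 3 (AP = 1/3), which gives Rank-1 = 0.5 and mAP = 7/12.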

Key words

cross-modality person re-identification / feature fusion / attention mechanism / memory storage mechanism / self-distillation

Funding

National Natural Science Foundation of China (62161015)

National Natural Science Foundation of China (61966021)

Publication year

2024
Journal of Computer-Aided Design & Computer Graphics
China Computer Federation

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.892
ISSN: 1003-9775