Cross-Modal Person Re-Identification Based on Mask Reconstruction with Dynamic Attention

Cross-modal person re-identification is a challenging pedestrian retrieval task. Existing research focuses on reducing inter-modal differences by extracting modality-shared features, while neglecting intra-modal differences and background interference. To address this, a mask reconstruction and dynamic attention (MRDA) network is proposed. The network reconstructs the features of human body regions to eliminate the influence of background clutter, which makes it more robust to background changes. In addition, it incorporates a dynamic attention mechanism that filters out irrelevant information and dynamically mines and enhances discriminative feature representations, eliminating the influence of intra-modal differences. Experimental results show that, in the all-search mode of the SYSU-MM01 dataset, the probability that the first retrieved result is a correct match (Rank-1) and the mean average precision (mAP) reach 70.55% and 63.89%, respectively. In the visible-to-infrared retrieval mode of the RegDB dataset, Rank-1 and mAP reach 91.80% and 82.08%, respectively. These results on public datasets verify the effectiveness of the proposed method.
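The two reported metrics can be illustrated with a minimal sketch. It assumes a precomputed query-to-gallery distance matrix in which a smaller value means a closer match; the function name `rank1_and_map` and its arguments are hypothetical and are not taken from the paper. Dataset-specific protocol details, such as camera filtering or repeated gallery sampling, are deliberately omitted.

```python
import numpy as np

def rank1_and_map(dist, query_ids, gallery_ids):
    """Illustrative Rank-1 and mAP from a (num_query, num_gallery) distance matrix.

    Assumption: smaller distance means more similar. This is a simplified
    sketch, not the official evaluation protocol of any dataset.
    """
    dist = np.asarray(dist, dtype=float)
    query_ids = np.asarray(query_ids)
    gallery_ids = np.asarray(gallery_ids)

    rank1_hits, average_precisions = [], []
    for q in range(dist.shape[0]):
        order = np.argsort(dist[q])                   # nearest gallery item first
        matches = gallery_ids[order] == query_ids[q]  # True where identity matches
        if not matches.any():
            continue                                  # skip queries with no correct match
        rank1_hits.append(float(matches[0]))          # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1       # 1-based ranks of correct results
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())  # average precision for this query

    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))

# Toy example: two queries of identity 1 against a three-image gallery.
dist = [[0.1, 0.5, 0.3],
        [0.9, 0.2, 0.4]]
rank1, mean_ap = rank1_and_map(dist, query_ids=[1, 1], gallery_ids=[1, 2, 1])
print(rank1, mean_ap)
```

In the toy example the first query ranks both correct images first, while the second ranks a wrong identity first, so Rank-1 is 0.5 and mAP is the mean of the two per-query average precisions.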

person re-identification; cross-modality; mask reconstruction; two-stream network; dynamic attention

张阔、范馨月、李嘉辉、张干

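School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China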

National Natural Science Foundation of China (62271096)

2024

Laser & Optoelectronics Progress
Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences

CSTPCD; Peking University Core Journals
Impact factor: 1.153
ISSN: 1006-4125
Year, volume (issue): 2024, 61(10)