Visible-Infrared Person Re-Identification Algorithm Based on Feature Fusion

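Abstract (translated from the Chinese): With the rapid development and deployment of intelligent surveillance devices, the massive volume of surveillance data generated is difficult to process through traditional manual effort. In addition, with the wide adoption of RGB-IR dual-mode cameras in recent years, infrared surveillance video can now be used to assist related vision tasks. To retrieve the same pedestrian more accurately across visible and infrared surveillance video, a visible-infrared person re-identification algorithm based on feature fusion is proposed. The algorithm first designs a Transformer-based feature extractor that generates discriminative features from the data of both modalities. Then, to exploit the complementary information of the two modalities, a bidirectional multi-modal attention method is proposed that aligns the features of the different modalities while fusing their complementary semantic information; a classifier finally performs the identification. Experiments on public datasets show that the proposed algorithm generalizes better and is more robust than most existing algorithms, reaching a prediction accuracy of 99.86% on the SYSU-MM01 dataset and 94.13% on the LLCM dataset.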

English title: Multi-modal Feature Fusion for Visible-Infrared Person Re-Identification

English abstract: Due to the rapid development and deployment of intelligent monitoring devices, the massive monitoring data generated is difficult to process through human effort. In addition, with the wide application of RGB-IR dual-mode cameras in recent years, infrared surveillance data has been utilized to assist related computer vision tasks. In order to more accurately retrieve the same pedestrians in visible-infrared dual-modal surveillance videos, a visible-infrared person re-identification algorithm based on feature fusion is proposed. The algorithm first designs a Transformer-based feature extractor to generate discriminative features from the two image modalities. Then, considering the use of complementary information between the two modalities, a bidirectional multi-modal attention method is proposed to align the features of the different modalities and simultaneously fuse complementary semantic information. Experiments on public datasets show that the proposed algorithm has better generalization ability and robustness than most existing algorithms. The prediction accuracy reaches 99.86% on the SYSU-MM01 dataset and 94.13% on the LLCM dataset.
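
The bidirectional attention step named in the abstract can be illustrated with a minimal sketch. The abstract gives no architectural details, so everything below is an assumption made for illustration: the function names (`cross_attention`, `bidirectional_fusion`) are hypothetical, there are no learned query/key/value projections, attention is single-head, and fusion is a residual sum followed by mean pooling and concatenation. It is not the authors' implementation; it only shows visible tokens attending to infrared tokens and vice versa before the two streams are merged.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query token attends over the
    tokens of the other modality (no learned projections; an assumption)."""
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    attended = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        attended.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return attended


def mean_pool(tokens):
    """Average a list of token vectors into one vector."""
    count = len(tokens)
    return [sum(t[d] for t in tokens) / count for d in range(len(tokens[0]))]


def bidirectional_fusion(visible_tokens, infrared_tokens):
    """Hypothetical fusion: attend in both directions, add each result to
    its own stream as a residual, pool, and concatenate the two streams."""
    vis_from_ir = cross_attention(visible_tokens, infrared_tokens)
    ir_from_vis = cross_attention(infrared_tokens, visible_tokens)
    vis_fused = [[a + b for a, b in zip(t, u)]
                 for t, u in zip(visible_tokens, vis_from_ir)]
    ir_fused = [[a + b for a, b in zip(t, u)]
                for t, u in zip(infrared_tokens, ir_from_vis)]
    return mean_pool(vis_fused) + mean_pool(ir_fused)  # list concatenation


# Toy input: two visible tokens and one infrared token, feature dimension 2.
fused = bidirectional_fusion([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]])
print(len(fused))
```

In a real system the token lists would come from the Transformer feature extractor, and the fused vector would be passed to the classifier mentioned in the Chinese abstract.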

English keywords:

visible-infrared person ReID; deep learning; cross-modal learning; computer vision

Authors:

申汶山 (Shen Wenshan), 王洁 (Wang Jie), 黄琴 (Huang Qin)

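Author affiliations:

School of Mathematics and Computer Science, Shanxi Normal University, Taiyuan, Shanxi 030031

Institute of Big Data Science and Industry, Shanxi University, Taiyuan, Shanxi 030000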

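Keywords (translated from the Chinese):

visible-infrared person re-identification; deep learning; cross-modal learning; computer vision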

Publication year:

2024

Journal of Shanxi Normal University (Natural Science Edition)

Shanxi Normal University

Impact factor: 0.512

ISSN: 1009-4490

Year, volume (issue): 2024, 38(1)

Number of references: 23