RGB-T object tracking network based on multi-scale modality fusion
RGB-T (RGB-thermal) object tracking has attracted attention because it is less constrained by lighting conditions. An RGB-T object tracking network is proposed to address the differences in resolution and semantic information among features at different scales, the inconsistency between visible and thermal infrared modal information, and the shortcomings of existing networks' multimodal fusion strategies. The network adopts a Siamese structure. First, the template image features and search image features output by the backbone feature extraction network are expanded from a single scale to multiple scales, and modality fusion of the visible and thermal infrared modalities is performed separately at each scale. The fused features are then passed through an attention mechanism to enhance the feature representation. Finally, the prediction results are obtained by a region proposal network. Experimental results on two public RGB-T datasets, GTOT and RGBT-234, show that the network achieves high tracking precision and success rate, can cope with complex tracking scenarios, and delivers higher tracking performance than competing networks.
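The pipeline the abstract describes (expand single-scale backbone features to multiple scales, fuse the RGB and thermal modalities separately at each scale, then enhance the fused features with attention) can be sketched minimally in NumPy. All function names here are illustrative, and the plain pooling and averaging stand in for the learned convolutions of the actual network, whose layer details the abstract does not specify:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def avg_pool(feat, s):
    # non-overlapping s x s average pooling on a (C, H, W) feature map
    C, H, W = feat.shape
    return feat.reshape(C, H // s, s, W // s, s).mean(axis=(2, 4))

def channel_attention(feat):
    # squeeze-and-excite-style channel weighting: global average pool
    # per channel, squash to (0, 1), rescale each channel of the map
    w = sigmoid(feat.mean(axis=(1, 2)))          # (C,)
    return feat * w[:, None, None]

def fuse_modalities(rgb, tir):
    # fuse the two modalities at one scale; the fixed averaging here
    # stands in for a learned mixing layer in the real network
    mixed = 0.5 * (rgb + tir)                    # (C, H, W)
    return channel_attention(mixed)

def multi_scale_fusion(rgb, tir, scales=(1, 2, 4)):
    # expand single-scale backbone features into a pyramid and fuse
    # RGB and thermal features separately at each scale
    return [fuse_modalities(avg_pool(rgb, s), avg_pool(tir, s))
            for s in scales]

# toy backbone outputs: 8 channels at 16 x 16 spatial resolution
rgb = np.random.rand(8, 16, 16)
tir = np.random.rand(8, 16, 16)
fused = multi_scale_fusion(rgb, tir)
print([f.shape for f in fused])   # [(8, 16, 16), (8, 8, 8), (8, 4, 4)]
```

In the paper's network, each per-scale fused map would then feed the region proposal head; this sketch only shows why per-scale fusion preserves both the high-resolution detail of the fine scales and the stronger semantics of the coarse ones.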

object tracking; RGB-Thermal; multi-scale features; modality fusion; deep learning

Cheng Zhuxuan, Fan Huijie, Tang Yandong, Wang Qiang


College of Information Engineering, Shenyang University of Chemical Technology, Shenyang 110142, Liaoning, China

State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China

Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China

Key Laboratory of Integrated Automation for Equipment Manufacturing of Liaoning Province, Shenyang University, Shenyang 110044, Liaoning, China



National Natural Science Foundation of China; Key Support Project of the NSFC Joint Fund

62273339; U20A20200

2024

Journal of Shandong University of Science and Technology (Natural Science)
Shandong University of Science and Technology

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.437
ISSN:1672-3767
Year, volume (issue): 2024, 43(1)