Enhanced visual object tracking network based on attention mechanism

Zhao An¹, Zhang Yi¹

Author information

1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065, China

Abstract

To improve the performance of conventional Transformer-based trackers and to address the problem of combining them with attention mechanisms, a Transformer-based visual tracker with attention mechanisms, called EVOTA, is proposed. A channel attention module with a local cross-channel interaction strategy is introduced; it adaptively recalibrates channel-wise feature responses by explicitly modelling the interdependencies between channels. Inspired by neuroscience theories, an energy function is proposed to analyze the importance of each neuron in the network and to infer the corresponding 3D weights. On multiple benchmark datasets, EVOTA outperforms a number of strong trackers.
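The abstract names two attention components but gives no formulas. As a rough illustration only, the sketch below assumes forms resembling two published designs: an ECA-style 1D convolution over pooled channel descriptors for the local cross-channel interaction, and a SimAM-style inverse energy for the per-neuron 3D weights. The function names, the `kernel` values, and the `lam` regularizer are hypothetical and may differ from what EVOTA actually uses.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(feat, kernel):
    """Rescale each channel of feat (C x H x W nested lists).

    kernel: odd-length list of 1D convolution weights applied across
    neighbouring channels (hypothetical values; learned in practice).
    """
    c = len(feat)
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feat]
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    # Local cross-channel interaction: each weight sees only k neighbours.
    weights = [
        sigmoid(sum(kernel[j] * padded[i + j] for j in range(len(kernel))))
        for i in range(c)
    ]
    return [[[w * v for v in row] for row in ch] for ch, w in zip(feat, weights)]


def energy_attention(feat, lam=1e-4):
    """Weight every element by a SimAM-style inverse energy (assumed form)."""
    out = []
    for ch in feat:
        vals = [v for row in ch for v in row]
        mu = sum(vals) / len(vals)
        # Per-channel variance with an n - 1 denominator.
        var = sum((v - mu) ** 2 for v in vals) / max(len(vals) - 1, 1)
        out.append([
            # Elements far from the channel mean get a larger weight.
            [v * sigmoid((v - mu) ** 2 / (4.0 * (var + lam)) + 0.5) for v in row]
            for row in ch
        ])
    return out
```

Both functions preserve the C x H x W shape, so under these assumptions they could be applied one after the other to a backbone feature map. How EVOTA orders or fuses the two modules is not stated in the abstract.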

Key words

attention mechanism/visual tracking/Transformer structure/convolutional neural network/deep learning/feature fusion/Siamese network

Funding

National Natural Science Foundation of China, Regional Innovation Joint Fund (U20A20161)

Publication year

2024

Computer Engineering and Design

Institute 706, Second Academy of China Aerospace Science and Industry Corporation (CASIC)

CSTPCD; Peking University Core Journals

Impact factor: 0.617

ISSN: 1000-7024
