Enhanced visual object tracking network based on attention mechanism
Zhao An 1, Zhang Yi 1
Author information
- 1. College of Computer Science, Sichuan University, Chengdu, Sichuan 610065
Abstract
To improve the performance of traditional Transformer-based trackers and to address the problem of integrating attention mechanisms into them, a Transformer-based visual tracker with attention mechanisms (called EVOTA) is proposed. A channel attention module with a local cross-channel interaction strategy is developed; it adaptively recalibrates channel-wise feature responses by explicitly modelling the interdependencies between channels. Inspired by neuroscience theories, an energy function is proposed to analyze the importance of each neuron in the network and to infer the corresponding 3D weights. On multiple benchmark datasets, EVOTA outperforms a range of strong trackers.
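The abstract does not give formulas for the two attention components, so the following is only a minimal sketch under stated assumptions. It assumes the channel module behaves like an ECA-style block (global average pooling, a 1D convolution across neighbouring channels, a sigmoid gate) and that the energy function has a SimAM-style closed form. The names `channel_attention`, `energy_attention`, `kernel`, and `lam` are hypothetical and are not taken from the paper; the kernel weights would normally be learned.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, kernel):
    """ECA-style channel gate (assumed form, not the paper's exact module).

    x is a feature map stored as nested lists of shape C x H x W.
    kernel is an odd-length list of 1D weights shared by all channels.
    """
    num_channels = len(x)
    # Global average pooling: one scalar descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    half = len(kernel) // 2
    gates = []
    for i in range(num_channels):
        acc = 0.0
        # Local cross-channel interaction: each channel only sees its
        # len(kernel) nearest neighbours (zero padding at the borders).
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < num_channels:
                acc += w * pooled[idx]
        gates.append(sigmoid(acc))
    # Rescale every value of a channel by that channel's gate.
    return [[[v * gates[i] for v in row] for row in ch] for i, ch in enumerate(x)]


def energy_attention(x, lam=1e-4):
    """SimAM-style 3D weights (assumed form of the energy function).

    A neuron that deviates more from its channel mean has lower energy
    and therefore receives a larger weight. lam is a small regulariser.
    """
    out = []
    for ch in x:
        vals = [v for row in ch for v in row]
        mu = sum(vals) / len(vals)
        n = max(len(vals) - 1, 1)  # guard against a 1 x 1 channel
        var = sum((v - mu) ** 2 for v in vals) / n
        out.append([
            [v * sigmoid((v - mu) ** 2 / (4.0 * (var + lam)) + 0.5) for v in row]
            for row in ch
        ])
    return out


if __name__ == "__main__":
    feature = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [0.0, 2.0]]]
    gated = channel_attention(feature, kernel=[0.25, 0.5, 0.25])
    refined = energy_attention(gated)
    print(refined)
```

Plain Python lists keep the sketch dependency-free; a practical tracker would express the same two steps with a tensor library and learn the kernel weights during training.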
Key words
attention mechanism/visual tracking/Transformer structure/convolutional neural network/deep learning/feature fusion/siamese network
Funding
Regional Innovation Joint Fund of the National Natural Science Foundation of China (U20A20161)
Publication year
2024