Visual place recognition method based on a parallel omni-dimensional dynamic attention mechanism

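To address the low robustness of visual place recognition under environmental changes such as weather, season, and lighting, this paper proposes a multi-dimensional attention mechanism that improves the environmental robustness of visual place recognition feature descriptors: the parallel omni-dimensional dynamic attention mechanism (POD-Attention). To explore convolutional kernels dynamically and at fine granularity across all dimensions, and to strengthen the feature extraction network's ability to extract and learn invariant features such as buildings, an omni-dimensional dynamic convolution block adds complementary attention along every dimension of the kernel (input and output channels, the convolutional spatial dimension, and the number of kernels). A 1×1 convolution, a Skip Squeeze-and-Excitation (SSE) module, and the omni-dimensional dynamic convolution block are fused in parallel, which not only speeds up feature extraction but also enlarges the receptive field of the visual place recognition network, further improving recognition accuracy. Experiments on public datasets show that a visual place recognition method based on VGG16 and Patch-NetVLAD feature aggregation, once improved with the POD attention mechanism, raises Recall@1 by 9.7% on Nordland and by 1.8% on Mapillary Street-Level Sequences. This demonstrates that the POD attention mechanism improves network performance and that the visual place recognition method built on it is effective.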
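The abstract names the four kernel dimensions that receive attention but does not state a formula. As a hedged illustration only, omni-dimensional dynamic convolution is commonly written in the form below. The symbols (n candidate kernels W_i, input x, output y, and the four attention terms) are assumptions taken from the general formulation of that technique, not from this paper, and the operator used to fuse the three parallel branches is not specified in the abstract.

```latex
% Sketch of omni-dimensional dynamic convolution (symbols are assumptions):
%   W_i          : i-th candidate convolutional kernel, i = 1..n
%   \alpha_{wi}  : scalar attention over the n kernels (kernel number)
%   \alpha_{fi}  : attention over output channels (filters)
%   \alpha_{ci}  : attention over input channels
%   \alpha_{si}  : attention over the k x k spatial positions of the kernel
%   \odot        : multiplication along the matching kernel dimension
%   *            : convolution
\[
  y = \Bigl( \sum_{i=1}^{n}
        \alpha_{wi} \odot \alpha_{fi} \odot \alpha_{ci} \odot \alpha_{si} \odot W_i
      \Bigr) * x
\]
```

Each attention term is computed from the input x, so the effective kernel changes per sample. This is what "dynamic" refers to in the abstract's description.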
Visual place recognition method based on parallel omni-dimensional dynamic attention mechanism
To address the low robustness of visual place recognition under environmental changes such as weather, season, and lighting, we propose parallel omni-dimensional dynamic attention (POD-Attention). To explore convolutional kernels dynamically and at fine granularity across all dimensions, and to strengthen the feature extraction network's ability to capture invariant features such as buildings, complementary attention is incorporated into the omni-dimensional dynamic convolutional block. This attention operates on all dimensions of the convolutional kernels, namely the input and output channels, the convolutional space, and the number of kernels. Furthermore, fusing the 1×1 convolution, the skip squeeze-and-excitation (SSE) module, and the omni-dimensional dynamic convolutional block in parallel both speeds up feature extraction and enlarges the receptive field of the visual place recognition network, which lets the network capture more comprehensive information and improves recognition accuracy. Experiments on public datasets show that a visual place recognition method based on VGG16 and Patch-NetVLAD feature aggregation, when improved with the POD attention mechanism, achieves a 9.7% increase in Recall@1 on the Nordland dataset and a 1.8% increase on the Mapillary Street-Level Sequences dataset. These results demonstrate that the proposed POD attention mechanism enhances the robustness of visual place recognition under varying environmental conditions, laying a foundation for more accurate visual localization and map construction in visual SLAM.
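The results above are reported as Recall@1. As a minimal sketch of how such a metric is typically computed in place recognition, assuming a query counts as correct when any of its top-N retrieved reference images is a true match: the function name `recall_at_n` and the toy data are hypothetical, and the paper's exact matching tolerance is not given in the abstract.

```python
from typing import Sequence, Set


def recall_at_n(ranked: Sequence[Sequence[int]],
                ground_truth: Sequence[Set[int]],
                n: int = 1) -> float:
    """Fraction of queries with at least one true match in the top-n results.

    ranked[q]       : reference indices retrieved for query q, best first
    ground_truth[q] : set of reference indices that are true matches for q
    """
    if len(ranked) != len(ground_truth):
        raise ValueError("ranked and ground_truth must have the same length")
    if not ranked:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for retrieved, truth in zip(ranked, ground_truth)
        if any(idx in truth for idx in retrieved[:n])
    )
    return hits / len(ranked)


# Toy data (hypothetical): three queries, three retrieved references each.
example_ranked = [[3, 1, 2], [0, 4, 5], [7, 8, 9]]
example_truth = [{3}, {4}, {1}]

recall_1 = recall_at_n(example_ranked, example_truth, n=1)  # only query 0 hits
recall_2 = recall_at_n(example_ranked, example_truth, n=2)  # queries 0 and 1 hit
```

In this toy case one of three queries is matched at rank 1 and two of three within the top 2, so Recall@1 is lower than Recall@2, as expected for any retrieval system.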

Keywords: visual place recognition; environmental robustness; deep learning; parallel omni-dimensional dynamic attention; parallel strategy

Liu Peijin (刘沛津), Liu Shujie (刘淑婕), He Lin (何林), Peng Lijun (彭莉峻), Fu Xuefeng (付雪峰)

School of Mechanical and Electrical Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi, China

School of Science, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi, China

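Funding: Key Research and Development Program of Shaanxi Province; Special Scientific Research Project of the Education Department of Shaanxi Province; Natural Science Special Project of Xi'an University of Architecture and Technology; Science and Technology Foundation of Xi'an University of Architecture and Technology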

Grant numbers: 2022GY-134; 21JK0732; ZR19058; ZR19059

2024

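Chinese Journal of Liquid Crystals and Displays (液晶与显示)
Sponsors: Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufacturers Association; Liquid Crystal Branch of the Chinese Physical Society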

Indexed in: CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.964
ISSN:1007-2780
Year, volume (issue): 2024, 39(9)