
A Panoramic Image Depth Estimation Method Fusing Distortion-aware Convolution and Attention Mechanisms

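Panoramic images offer a wide field of view and complete scene information, capture the spatial structure and details of a scene comprehensively, and can provide accurate, fine-grained depth information; they have therefore attracted considerable attention in autonomous driving and robot navigation. Among conventional panoramic projections, equirectangular projection (ERP) provides a rich field of view but suffers from distortion and cannot capture complete contextual information, whereas cubemap projection (CMP) avoids distortion but has a narrow field of view and discontinuous boundaries. To address these problems, this paper proposes DeCSBAFuse, a dual-branch panoramic depth estimation method that combines equirectangular distortion-aware convolution with a hybrid attention mechanism. Its main contributions are: an EDeConv branch based on equirectangular distortion-aware convolution, which incorporates image information along the vertical direction to fully preserve context; and a CSBA branch, built on channel and spatial attention with an added batch attention mechanism, to better handle the boundaries between CMP faces. Compared with mainstream panoramic depth estimation methods, DeCSBAFuse yields more accurate depth predictions on three benchmark datasets and, in finely textured regions in particular, predicts sharper depth boundaries.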
Depth Estimation in Panoramic Images: Distortion-aware Convolution and Hybrid Attention
Panoramic images, with their wide field of view and comprehensive scene information, including spatial structure and details, provide accurate and detailed depth information and have gained significant attention in autonomous driving and robot navigation. Traditional panoramic projections each have drawbacks: Equirectangular Projection (ERP) offers a rich scene view but suffers from distortion and cannot capture complete contextual information, while Cubemap Projection (CMP) avoids distortion but has a limited field of view and discontinuous boundaries. To address these issues, we propose a dual-branch panoramic image depth estimation method called DeCSBAFuse, which combines equirectangular distortion-aware convolution with a hybrid attention mechanism. The key innovations are as follows: an ERP branch based on EDPC, which integrates vertical image information to fully preserve contextual information; and a CMP branch based on channel and spatial attention mechanisms, which incorporates batch attention to better handle boundary issues between different CMP faces. Compared with mainstream panoramic image depth estimation methods, DeCSBAFuse produces more accurate depth predictions on three benchmark datasets and, especially in areas with detailed textures, predicts clearer depth boundaries.
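To make the ERP distortion mentioned in the abstract concrete, the following minimal sketch computes how much each pixel row of an equirectangular image is stretched horizontally. It relies only on the standard geometry of the projection (a row at latitude φ is stretched by 1/cos φ); the function name `erp_row_stretch` is hypothetical, and this is not the authors' distortion-aware convolution, only an illustration of the effect such a convolution has to compensate for.

```python
import math

def erp_row_stretch(height):
    """Horizontal stretch factor of each pixel row in an equirectangular image."""
    factors = []
    for v in range(height):
        # Latitude of the row centre: close to +pi/2 at the top, -pi/2 at the bottom.
        lat = (0.5 - (v + 0.5) / height) * math.pi
        # A circle of latitude has circumference proportional to cos(lat), yet every
        # row has the same pixel width, so the row is stretched by 1 / cos(lat).
        factors.append(1.0 / math.cos(lat))
    return factors

if __name__ == "__main__":
    for v, s in enumerate(erp_row_stretch(8)):
        print(f"row {v}: stretch x{s:.2f}")
```

Rows near the equator have a factor close to 1, while rows near the poles are stretched several times over, which is why a fixed square kernel covers very different scene areas at different image heights.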

panoramic image; depth estimation; attention mechanism; geometric projection; distortion-aware convolution

Zhu Zhe, Huang Li, Wang Zongyang


School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430065, Hubei, China

Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan 430065, Hubei, China


2024

Computer Technology and Development
Shaanxi Computer Society


CSTPCD
Impact factor: 0.621
ISSN: 1673-629X
Year, volume (issue): 2024, 34(12)