Depth Estimation of Monocular Indoor Scenes Based on Attention Mechanism and Multi-level Correction
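Liu Peng (刘鹏) 1, Ding Aihua (丁爱华) 1, Dou Xinyu (窦新宇) 1
Author information
- 1. School of Intelligence and Information Engineering, Tangshan University, Tangshan, Hebei 063000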
Abstract
Scene depth estimation is widely used in 3D vision. To address the low accuracy and poor fine-grained prediction of monocular indoor scene depth estimation, a monocular depth estimation network based on an attention mechanism and multi-level correction is proposed. The network first uses a dual-branch module that combines a self-attention Transformer with a convolutional neural network to extract multi-resolution features from the color image. A module based on a spatial-domain attention mechanism then progressively fuses the extracted multi-resolution features. Finally, the fused features are processed by multi-level correction, and depth images at different resolutions are estimated progressively. Experimental results show that, compared with similar methods, the proposed network effectively improves the prediction of fine-grained information in depth images, and several of its evaluation metrics improve to varying degrees.
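The abstract does not give layer-level details, so the following is only a minimal NumPy sketch of the two later stages it names: progressive fusion gated by a spatial attention map, followed by coarse-to-fine depth correction. Every function name, the averaging used in place of learned convolutions, and the assumption that all levels share the same channel count are hypothetical stand-ins, not the authors' implementation. The dual-branch Transformer/CNN encoder is omitted and replaced by random feature maps.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) array by an integer factor."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def spatial_attention(x):
    """Per-pixel weight in (0, 1) from the channel-wise mean and max.

    Hypothetical: a real module would pass these statistics through a
    learned convolution; a fixed average stands in for it here.
    """
    score = 0.5 * (x.mean(axis=0) + x.max(axis=0))
    return 1.0 / (1.0 + np.exp(-score))  # shape (H, W)

def fuse(coarse, fine):
    """Blend an upsampled coarse map with a finer map using spatial weights."""
    up = upsample_nearest(coarse, fine.shape[1] // coarse.shape[1])
    w = spatial_attention(fine + up)
    return w * fine + (1.0 - w) * up  # convex combination per pixel

def progressive_fusion(features):
    """Fuse features ordered from coarsest to finest, one level at a time."""
    fused = [features[0]]
    for f in features[1:]:
        fused.append(fuse(fused[-1], f))
    return fused

def multilevel_depth(fused):
    """Estimate depth at the coarsest level, then correct it level by level.

    Hypothetical: the channel mean stands in for a learned prediction head.
    """
    depth = fused[0].mean(axis=0, keepdims=True)
    depths = [depth]
    for f in fused[1:]:
        up = upsample_nearest(depth, f.shape[1] // depth.shape[1])
        depth = up + f.mean(axis=0, keepdims=True)  # upsampled depth + residual
        depths.append(depth)
    return depths

# Random stand-ins for encoder features at three resolutions (8 channels each).
rng = np.random.default_rng(0)
features = [rng.standard_normal((8, s, s)) for s in (4, 8, 16)]
depths = multilevel_depth(progressive_fusion(features))
print([d.shape for d in depths])  # [(1, 4, 4), (1, 8, 8), (1, 16, 16)]
```

In this sketch each finer level only adds a residual to the upsampled coarser estimate, which is one common way to read "multi-level correction"; the paper may define the correction differently.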
Key words
monocular depth estimation / Transformer / attention mechanism / multi-level correction
Publication year
2024