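Depth Estimation of Monocular Indoor Scenes Based on Attention Mechanism and Multi-level Correction (基于注意力机制和多级校正的单目室内场景深度估计)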

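Abstract (translated from Chinese): Scene depth estimation is widely used in 3D vision. To address the low accuracy and poor fine-grained prediction of monocular indoor scene depth estimation, a monocular depth estimation network based on an attention mechanism and multi-level correction is proposed. The network first extracts multi-resolution features from the color image with a dual-branch module that combines a self-attention Transformer with a convolutional neural network. A module based on a spatial attention mechanism then progressively fuses the extracted multi-resolution features. Finally, the fused features are processed by multi-level correction, and depth images at different resolutions are estimated progressively. Experimental results show that, compared with similar methods, the proposed network effectively improves the prediction of fine-grained information in depth images and improves several evaluation metrics to varying degrees.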

English title: Depth Estimation of Monocular Indoor Scenes Based on Attention Mechanism and Multi-level Correction

English abstract: Scene depth estimation has a wide range of applications in 3D vision. A monocular depth estimation network based on an attention mechanism and multi-level correction is proposed to address the low accuracy and poor fine-grained prediction of monocular indoor scene depth estimation. The network first uses a dual-branch module, combining a self-attention Transformer and a convolutional neural network, to extract multi-resolution features from color images. Then, a module based on a spatial attention mechanism progressively fuses the extracted multi-resolution features. Finally, the fused features are processed through multi-level correction, and depth images at different resolutions are estimated progressively. Experimental results show that, compared with similar methods, the proposed network effectively improves the prediction of fine-grained information in depth images, and several evaluation metrics improve to varying degrees.
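
The abstract does not specify how the spatial-attention fusion module is built, so the following is only a minimal sketch of the general idea of fusing multi-resolution features from coarse to fine with a per-pixel attention gate. Everything in it is an assumption made for illustration: the function names (`upsample2x`, `spatial_attention`, `fuse`, `progressive_fuse`), the nested-list `[C][H][W]` feature layout, the factor-of-two spacing between resolutions, and the fixed scalars `w_avg`, `w_max`, and `bias`, which stand in for whatever learned layers the actual network uses.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a [C][H][W] nested-list feature map."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out


def spatial_attention(fmap, w_avg=1.0, w_max=1.0, bias=0.0):
    """Return an [H][W] map of weights in (0, 1), one weight per pixel.

    The channel mean and channel max at each pixel are combined linearly and
    squashed with a sigmoid. The scalars are hypothetical placeholders for
    learned parameters.
    """
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    attn = []
    for y in range(height):
        row = []
        for x in range(width):
            vals = [fmap[c][y][x] for c in range(channels)]
            score = w_avg * sum(vals) / channels + w_max * max(vals) + bias
            row.append(1.0 / (1.0 + math.exp(-score)))
        attn.append(row)
    return attn


def fuse(coarse, fine):
    """Fuse a coarse map into a fine map that has twice its height and width."""
    up = upsample2x(coarse)
    # Attention is computed from the element-wise sum of both inputs.
    summed = [[[u + f for u, f in zip(ur, fr)]
               for ur, fr in zip(uc, fc)]
              for uc, fc in zip(up, fine)]
    attn = spatial_attention(summed)
    # Per-pixel convex combination: weight a on fine, (1 - a) on coarse.
    return [[[a * f + (1.0 - a) * u for a, f, u in zip(ar, fr, ur)]
             for ar, fr, ur in zip(attn, fc, uc)]
            for fc, uc in zip(fine, up)]


def progressive_fuse(features):
    """Fuse a coarse-to-fine list of feature maps, one level at a time."""
    fused = features[0]
    for fine in features[1:]:
        fused = fuse(fused, fine)
    return fused


if __name__ == "__main__":
    # Three single-channel levels at 1x1, 2x2 and 4x4 resolution.
    levels = [
        [[[0.0]]],
        [[[1.0, 1.0], [1.0, 1.0]]],
        [[[0.5] * 4 for _ in range(4)]],
    ]
    result = progressive_fuse(levels)
    print(len(result), len(result[0]), len(result[0][0]))
```

Because each output pixel is a convex combination of the two inputs at that location, the fused value always lies between the coarse and the fine feature values. That keeps the sketch easy to reason about, but a real fusion module would normally learn its weights and may combine features differently.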

English keywords:

monocular depth estimation; Transformer; attention mechanism; multi-level correction

Authors:

LIU Peng (刘鹏), DING Aihua (丁爱华), DOU Xinyu (窦新宇)

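Author affiliation:

College of Intelligence and Information Engineering, Tangshan University (唐山学院智能与信息工程学院), Tangshan, Hebei 063000, China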

Keywords (translated from Chinese):

monocular depth estimation; Transformer; attention mechanism; multi-level correction

Funding:

Tangshan Municipal Science and Technology Program (唐山市市级科技计划)

Project number:

22130205H

Publication year:

2024

DOI:

10.19850/j.cnki.2096-4706.2024.05.023

Journal: Modern Information Technology (现代信息科技)

Sponsor: Guangdong Electronics Society (广东省电子学会)

ISSN: 2096-4706

Year, volume (issue): 2024, 8(5)

Number of references: 20