Depth Estimation of Monocular Indoor Scenes Based on Attention Mechanism and Multi-level Correction
Scene depth estimation has a wide range of applications in 3D vision. To address the low accuracy and poor recovery of fine-grained detail in monocular depth estimation of indoor scenes, a monocular depth estimation network based on an attention mechanism and multi-level correction is proposed. The network first uses a dual-branch module, combining a self-attention Transformer and a convolutional neural network, to extract multi-resolution features from color images. Then, a module based on a spatial-domain attention mechanism progressively fuses the extracted multi-resolution features. Finally, the fused features are processed through multi-level correction, and depth images at different resolutions are estimated progressively. Experimental results show that, compared with similar methods, the proposed network effectively improves the prediction of fine-grained information in depth images and improves multiple evaluation metrics to varying degrees.
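The fusion and correction stages described above can be illustrated with a minimal NumPy sketch. It is not the network proposed here: the abstract does not specify layer details, so every name and design choice below is an assumption made for illustration. In particular, the dual-branch Transformer/CNN encoder is replaced by random placeholder features, the spatial attention map is a parameter-free sigmoid over channel-wise mean and max pooling instead of a learned convolution, and the hypothetical `head` vector stands in for a learned 1x1 projection to depth.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def upsample2x(x):
    """Nearest-neighbour 2x upsampling over the last two (H, W) axes."""
    return x.repeat(2, axis=-2).repeat(2, axis=-1)


def spatial_attention_fuse(fine, coarse):
    """Fuse a fine feature map (C, H, W) with a coarse one (C, H/2, W/2).

    Assumption: the per-pixel attention map comes from channel-wise mean
    and max pooling followed by a sigmoid; a learned layer would normally
    replace this fixed rule.
    """
    coarse_up = upsample2x(coarse)
    stacked = fine + coarse_up
    attn = sigmoid(stacked.mean(axis=0) + stacked.max(axis=0))  # (H, W)
    # Each pixel takes a convex combination of the two feature sources.
    return attn * fine + (1.0 - attn) * coarse_up


def coarse_to_fine_depth(features, head):
    """Estimate one depth map per resolution, coarsest first.

    features: list of (C, H, W) arrays, each level 2x the previous size.
    head: hypothetical (C,) weights projecting features to a scalar.
    """
    fused = features[0]
    depth = np.tensordot(head, fused, axes=1)  # (H, W)
    depths = [depth]
    for fine in features[1:]:
        fused = spatial_attention_fuse(fine, fused)
        # Correction step: upsample the previous estimate, add a residual.
        residual = np.tensordot(head, fused, axes=1)
        depth = upsample2x(depth) + residual
        depths.append(depth)
    return depths


# Placeholder multi-resolution features standing in for encoder outputs.
rng = np.random.default_rng(0)
features = [rng.standard_normal((8, s, s)) for s in (4, 8, 16)]
head = rng.standard_normal(8) / 8.0
for d in coarse_to_fine_depth(features, head):
    print(d.shape)
```

Each level refines the upsampled depth from the previous level with a residual computed from the attention-fused features, which is one common way to realize progressive, multi-resolution depth estimation; the correction scheme actually used by the network may differ.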