Unsupervised monocular depth estimation based on semantic information
With the development of deep learning, unsupervised monocular depth estimation has become a research hot spot in computer vision. However, existing methods suffer from serious problems, such as blurred contours in the depth image and inaccurate depth estimation. To address these problems, an unsupervised monocular depth estimation network based on semantic information is proposed, built on the encoder-decoder architecture. To obtain clearer contour information, the semantic information is refined by an atrous spatial pyramid pooling (ASPP) layer placed between the encoder and the decoder, which improves the quality of the generated images. The network extracts multi-resolution features through skip connections from the encoder to the decoder. In the encoder, an improved high-resolution network (HRNet) fuses the multi-resolution features of different layers, and a concatenation strategy fuses the outputs of the intermediate stages before decoding to improve the accuracy of depth estimation. Experimental results on the KITTI dataset show that the error metrics of the proposed method are lower than those of current unsupervised monocular depth estimation methods, and that the method reaches 89.4%, 96.3% and 98.1% on the three accuracy metrics, indicating good accuracy.
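The abstract does not define the three accuracy metrics. As a minimal sketch, assuming they are the threshold accuracies commonly reported for depth estimation on KITTI (the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25, 1.25^2 and 1.25^3), they could be computed as follows. The function names and the flat-list input format are illustrative assumptions, not part of the proposed method.

```python
def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    pred, gt: equal-length sequences of depths (assumed flat lists here).
    Pixels with a non-positive depth in either input are skipped as invalid.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    valid = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not valid:
        raise ValueError("no valid pixels to evaluate")
    hits = sum(1 for p, g in valid if max(p / g, g / p) < threshold)
    return hits / len(valid)


def accuracy_indexes(pred, gt):
    """Return the three accuracies for thresholds 1.25, 1.25**2, 1.25**3
    (assumed thresholds; the abstract does not state them)."""
    return tuple(threshold_accuracy(pred, gt, 1.25 ** k) for k in (1, 2, 3))


# Toy example with four pixels; the depth ratios are 1.0, 1.2, 1.5 and 2.0.
pred = [10.0, 12.0, 30.0, 5.0]
gt = [10.0, 10.0, 20.0, 10.0]
print(accuracy_indexes(pred, gt))
```

Under this assumption, the reported 89.4%, 96.3% and 98.1% would correspond to the three values returned by `accuracy_indexes`, each expressed as a percentage over all valid ground-truth pixels.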