Chinese Journal of Liquid Crystals and Displays, 2024, Vol. 39, Issue 10: 1402-1410. DOI: 10.37188/CJLCD.2023-0362

Single-view 3D reconstruction method based on neural radiation field occlusion optimization

Chen Zhijie (陈志杰) 1, Deng Huiping (邓慧萍) 1, Xiang Sen (向森) 1, Wu Jin (吴谨) 1

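Author information

  • 1. School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China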

Abstract

Single-view 3D reconstruction aims to recover the three-dimensional geometry of an object or scene from a single 2D image. Because a single view carries limited information, occlusion makes image features ambiguous, which prevents accurate recovery of the object's appearance details. This paper proposes a framework based on the neural radiance field (NeRF) that exploits both the global information and the local context of the image to address the occlusion problem. First, a Vision Transformer, which captures long-range dependencies in the data, learns global image features and is combined with a squeeze-and-excitation (SE) channel attention module to prevent the loss and redundancy of multi-layer feature information. Second, a convolutional neural network extracts pixel-aligned local image features, and the dilated convolutions of an atrous pyramid pooling structure enlarge the receptive field and mine multi-scale context, providing more information for recovering details in occluded regions. Finally, a Transformer-based density feature aggregation module is designed to reduce the density prediction errors caused by occlusion. Experiments on the ShapeNet-NMR dataset show that the method synthesizes novel views with more detail and generalizes well to unseen objects.
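
To illustrate why dilated convolution enlarges the receptive field, the sketch below applies one 3-tap kernel at several dilation rates in 1D and stacks the results, in the spirit of a pyramid pooling structure. It is a minimal pure-Python illustration, not the paper's network: the function names, the 1D setting, the odd kernel size, and the rates (1, 2, 4) are all assumptions chosen for readability.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Zero-padded, same-length 1D dilated cross-correlation.

    Assumes an odd kernel size so the kernel can be centred on each sample.
    """
    half = (effective_kernel_size(len(kernel), dilation) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # positions outside the signal count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def pyramid_features(signal, kernel, dilations=(1, 2, 4)):
    """Run parallel branches at several (hypothetical) dilation rates.

    Returns one tuple per input position, with one value per branch.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return list(zip(*branches))
```

With a 3-tap kernel, the span grows from 3 samples at dilation 1 to 5 at dilation 2 and 9 at dilation 4 while the number of weights stays fixed, which is how parallel branches at different rates gather multi-scale context.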

Key words

neural radiance field / occlusion / dilated convolution / Transformer

Funding

National Natural Science Foundation of China (61702384)

National Natural Science Foundation of China (61502357)

Publication year

2024
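Chinese Journal of Liquid Crystals and Displays (液晶与显示)
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufactures Association; Liquid Crystal Branch of the Chinese Physical Society

CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.964
ISSN: 1007-2780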