Semantics-aware transformer for 3D reconstruction from binocular images
Existing multi-view three-dimensional (3D) reconstruction methods capture only a single type of feature from each input view and thus fail to obtain the fine-grained semantics needed to reconstruct complex shapes. They also rarely explore the semantic association between input views, which leads to rough 3D shapes. To address these challenges, we propose a semantics-aware transformer (SATF) for 3D reconstruction. It is composed of two parallel view transformer encoders and a point cloud transformer decoder; it takes two red, green and blue (RGB) images as input and outputs a dense point cloud with rich details. Each view transformer encoder learns a multi-level feature, which facilitates characterizing fine-grained semantics from the input view. The point cloud transformer decoder derives a semantically associated feature by aligning the semantics of the two input views, thereby describing the semantic association between them. The decoder then generates a sparse point cloud from this semantically associated feature and finally enriches it to produce a dense point cloud with richer details. Extensive experiments on the ShapeNet dataset show that our SATF outperforms state-of-the-art methods.
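The abstract does not give implementation details, but both the view transformer encoders and the point cloud transformer decoder rest on the standard attention operation, and cross-view semantic alignment can be viewed as cross-attention where one view's features act as queries and the other view's as keys and values. The following is a minimal, purely illustrative pure-Python sketch of scaled dot-product attention; all function names and the toy vectors are hypothetical and do not reflect SATF's actual layers or dimensions.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention (illustrative sketch, not SATF's code).
    # For cross-view alignment, `queries` would come from one view's
    # features and `keys`/`values` from the other view's.
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Toy example: a single query attending over one key/value pair
# returns that value vector unchanged (its attention weight is 1).
print(attention([[1.0, 0.0]], [[1.0, 0.0]], [[5.0, 7.0]]))
```

Stacking such attention layers with multi-level feature maps, as the abstract describes, is what lets each encoder capture fine-grained per-view semantics before the decoder fuses the two views.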
JIA Xin, YANG Shourui, GUAN Diyi
The Engineering Research Center of Learning-Based Intelligent System and the Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, Tianjin 300384, China
Zhejiang University of Technology, Hangzhou 310014, China