Semantics-aware transformer for 3D reconstruction from binocular images
Existing multi-view three-dimensional (3D) reconstruction methods capture only a single type of feature from each input view and thus fail to obtain the fine-grained semantics needed to reconstruct complex shapes. They also rarely explore the semantic association between input views, which leads to rough 3D shapes. To address these challenges, we propose a semantics-aware transformer (SATF) for 3D reconstruction. It is composed of two parallel view transformer encoders and a point cloud transformer decoder; it takes two red, green and blue (RGB) images as input and outputs a dense point cloud with rich details. Each view transformer encoder learns a multi-level feature, which facilitates characterizing fine-grained semantics from the input view. The point cloud transformer decoder derives a semantically associated feature by aligning the semantics of the two input views, thereby describing the semantic association between them. The decoder then generates a sparse point cloud from this semantically associated feature and finally enriches it to produce a dense point cloud with richer details. Extensive experiments on the ShapeNet dataset show that our SATF outperforms state-of-the-art methods.
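The abstract does not give implementation details, but both the view transformer encoders and the point cloud transformer decoder rest on the standard attention operation, and cross-view semantic alignment can be viewed as cross-attention where one view's features act as queries and the other view's as keys and values. The following is a minimal, purely illustrative pure-Python sketch of scaled dot-product attention; all function names and the toy vectors are hypothetical and do not reflect SATF's actual layers or dimensions.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    # Scaled dot-product attention (illustrative sketch, not SATF's code).
    # For cross-view alignment, `queries` would come from one view's
    # features and `keys`/`values` from the other view's.
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Toy example: a single query attending over one key/value pair
# returns that value vector unchanged (its attention weight is 1).
print(attention([[1.0, 0.0]], [[1.0, 0.0]], [[5.0, 7.0]]))
```

Stacking such attention layers with multi-level feature maps, as the abstract describes, is what lets each encoder capture fine-grained per-view semantics before the decoder fuses the two views.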
JIA Xin, YANG Shourui, GUAN Diyi
The Engineering Research Center of Learning-Based Intelligent System and the Key Laboratory of Computer Vision and System of Ministry of Education, Tianjin University of Technology, Tianjin 300384, China
Zhejiang University of Technology, Hangzhou 310014, China