3D face imaging and reconstruction technology: a review

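Chinese abstract (translated): Driven by rapid progress in new 3D vision measurement techniques and deep learning models, 3D vision has become a key supporting technology for artificial intelligence, virtual reality, and related fields, and 3D face imaging and reconstruction have achieved breakthroughs. These technologies cope better with variations in illumination, occlusion, expression, and pose, and they make forgery attacks harder; they have greatly advanced the reconstruction and rendering of realistic "virtual digital humans" and effectively improved the security of face systems. This paper comprehensively reviews 3D face imaging technologies and reconstruction models, with a systematic, in-depth analysis of deep-learning-based 3D face reconstruction. First, 3D face imaging devices and acquisition systems are surveyed, compared, and summarized in detail, and face imaging systems based on new sensing technologies are introduced. Then, deep-learning-based 3D face reconstruction models are analyzed systematically and divided by input data source into four classes of algorithms: those based on monocular images, multi-view images, video, and speech. Through this analysis, the paper summarizes the state of research on 3D face imaging and the difficulties and challenges it faces, and discusses future directions and applications. It covers classic techniques and studies on 3D face imaging and reconstruction from the past five years and provides a useful reference for face research, development, and application.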

English title: 3D face imaging and reconstruction technology: a review

English abstract: As a breakthrough technology of artificial intelligence (AI) in the big data era, deep learning (DL) has prompted a renewed upsurge in face technology. Powered by rapid developments in new technologies such as three-dimensional (3D) vision measurement, image processing chips, and DL models, 3D vision has become a key supporting technology in AI, virtual reality, and related fields. Studies and applications of 3D face imaging and reconstruction have achieved important breakthroughs. 3D face data represent multidimensional facial attributes precisely because they carry rich visual information such as texture, shape, and spatial structure. Moreover, 3D face data are robust to large occlusions, expressions, and poses, and they increase the difficulty of forgery attacks. Therefore, 3D face imaging and reconstruction effectively promote the reconstruction and rendering of realistic "virtual digital humans" and contribute to improved security of face systems. In this paper, we comprehensively review 3D face imaging technology and reconstruction models, and we analyze DL-based 3D face reconstruction methods systematically and in depth. First, the development and innovation of 3D face imaging devices and capturing systems are discussed through a summary of public 3D face datasets. The devices and systems range from consumer imaging devices (such as Kinect) to complex hybrid systems that fuse active and passive 3D imaging technologies to achieve precise geometry and appearance. 3D face imaging based on new sensing technologies is also introduced. Then, from the perspective of input source, DL-based 3D face reconstruction methods are categorized into monocular, multiview, video, and audio reconstruction methods. The section on 3D face imaging technology introduces classic public 3D face datasets, popular 3D face imaging devices, and capturing systems. Most high-quality 3D face datasets, such as BU-3DFE, FaceScape, and FaceVerse, are captured in a large imaging volume with a number of high-resolution cameras under controlled lighting conditions. They play key roles in applications such as realistic rendering, driven animation, and retargeting. On the other hand, novel optical devices and imaging modules with small size and lightweight algorithms are needed for tiny AI on intelligent mobile devices. For DL-based 3D face reconstruction, monocular reconstruction has become the most popular technology. State-of-the-art 3D face reconstruction methods are generally trained in a self-supervised manner on large-scale 2D face databases. The difficulties in 3D face reconstruction include the lack of large-scale 3D face datasets, occlusions and poses in in-the-wild 2D face images, and continuous expression deformations. The DL network structures are categorized into general deep convolutional neural networks (such as ResNet, U-Net, and autoencoders), generative adversarial networks (GANs), implicit neural representations (INRs) (such as neural radiance fields (NeRF) and signed distance functions (SDF)), and Transformers. 3DMM and FLAME are widely used 3D face representation models. The StyleGAN model performs excellently in recovering high-quality face texture. INR has achieved remarkable results in 3D scene reconstruction, and the NeRF model plays an important role in reconstructing accurate head avatars. Combining NeRF with GANs shows great potential for reconstructing high-fidelity 3D face geometry and realistic rendered appearance. Moreover, the Transformer model, which greatly improves accuracy and speed, is mainly used in audio-driven 3D face reconstruction. Through in-depth analysis, the research difficulties of 3D face technology are summarized, and future developments are discussed. Although recent research has made remarkable progress, challenges remain: how to improve robustness and generalization to real-world lighting and extreme expressions and poses, how to effectively disentangle facial attributes (such as identity, expression, albedo, and specular reflectance), and how to recover accurate, detailed geometry of facial motions (such as wrinkles). This study presents a comprehensive and systematic review covering classical technologies and studies on 3D face imaging and reconstruction from the last five years, providing a useful reference for face studies, development, and applications.

English keywords:

3D face imaging; 3D face reconstruction; deep learning (DL); generative adversarial network (GAN); implicit neural representation (INR)

Authors:

Liu Fei (刘菲), Zhang Kunbo (张堃博), Yang Qing (杨青), Zhou Shubo (周树波), Wang Yunlong (王云龙), Sun Zhenan (孙哲南)

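Author affiliations:

School of Management Engineering, Capital University of Economics and Business, Beijing 100070, China

Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

College of Information Science and Technology, Donghua University, Shanghai 201620, China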

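Keywords (translated from Chinese):

3D face imaging; 3D face reconstruction; deep learning (DL); generative adversarial network (GAN); implicit neural representation (INR)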

Funding:

National Natural Science Foundation of China (five projects)

Grant numbers:

61806197; 61803372; 62071468; U23B2054; 62276263

Publication year:

2024

DOI:

10.11834/jig.230697

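Journal of Image and Graphics (中国图象图形学报)

Institute of Remote Sensing Applications, Chinese Academy of Sciences; China Society of Image and Graphics; Institute of Applied Physics and Computational Mathematics, Beijing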

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 1.111

ISSN: 1006-8961

Year, volume (issue): 2024, 29(9)