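To address two problems in 3D face reconstruction, namely the inaccurate reconstructed shape of DECA (Detailed Expression Capture and Animation) caused by information loss from training on 2D images, and the lack of high-frequency detail in MICA (MetrIC FAce) together with its reconstruction failures on face photos that cannot be recognized, this work uses 3D data and a more robust face recognition network for mixed supervised and self-supervised training. It proposes FIne-grained Facial Reconstruction (FiFR), a two-stage, high-precision, detail-fusing face reconstruction method based on the FLAME (Faces Learned with an Articulated Model and Expressions) face model, the AdaFace (Quality Adaptive Margin for Face Recognition) face recognition network, and the DECA framework. In the coarse reconstruction stage, an AdaFace identity encoder encodes the 2D image into a latent space, and a mapping network trained on 2D and 3D data converts the encoding into the corresponding parameters of the FLAME face model, producing the coarse reconstruction. In the fine reconstruction stage, following DECA, a detailed UV displacement map is generated under the constraint of a detail-consistency loss to enhance high-frequency facial detail, achieving fine-grained 3D face reconstruction from a single image. Experimental results show that FiFR reduces the average reconstruction error by 14% relative to DECA, and by 18% on low-resolution images; compared with MICA, the reconstructed faces contain more high-frequency detail.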
Two-stage 3D Fine-grained Facial Reconstruction Method Based on AdaFace Optimization
To address the limitations of current 3D facial reconstruction methods, such as the inaccurate shapes that result from training DECA (Detailed Expression Capture and Animation) on 2D images, which causes information loss, and the inability of MICA (MetrIC FAce) to recover high-frequency details or to handle facial images that cannot be recognized, a two-stage facial reconstruction approach termed FIne-grained Facial Reconstruction (FiFR) is proposed, which leverages 3D data and a more robust face recognition network for mixed supervised and self-supervised training. The method integrates the FLAME (Faces Learned with an Articulated Model and Expressions) facial model, the AdaFace (Quality Adaptive Margin for Face Recognition) face recognition network, and the DECA framework to achieve high-precision detail fusion. In the coarse reconstruction stage, an AdaFace identity encoder encodes the 2D image into a latent space, and a mapping network trained on 2D and 3D data transforms the encoding into the relevant parameters of the FLAME model, generating the coarse reconstruction result. In the fine reconstruction stage, inspired by DECA, a detailed UV displacement map is generated under the constraint of a detail-consistency loss to enhance high-frequency facial details, achieving fine-grained facial reconstruction from a single image. Experimental results demonstrate that FiFR reduces the average reconstruction error by 14% compared to DECA, with an 18% reduction in error for low-resolution images. Furthermore, FiFR reconstructions exhibit more high-frequency details than those of the MICA method.
3D facial reconstruction; deep learning; neural networks
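
The two-stage data flow summarized in the abstract can be illustrated with a small NumPy sketch. This is not the FiFR implementation: all array sizes, the two-layer mapping network, the random weights, and the function names (`mapping_network`, `linear_shape_model`, `apply_displacement`) are assumptions made for illustration. A linear shape model stands in for FLAME, a random vector stands in for the AdaFace identity embedding, and the fine stage is reduced to offsetting each vertex along its normal, as a per-vertex stand-in for sampling a UV displacement map.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
EMBED_DIM = 512    # assumed length of the identity embedding
N_SHAPE = 300      # assumed number of shape coefficients
N_VERTS = 100      # toy vertex count; a real face mesh is much larger


def mapping_network(embedding, w1, w2):
    """Toy two-layer MLP: identity embedding -> shape coefficients."""
    hidden = np.maximum(embedding @ w1, 0.0)  # ReLU activation
    return hidden @ w2


def linear_shape_model(template, shape_basis, coeffs):
    """Coarse mesh = template + linear combination of shape basis vectors."""
    offsets = shape_basis @ coeffs            # shape (N_VERTS * 3,)
    return template + offsets.reshape(-1, 3)  # shape (N_VERTS, 3)


def apply_displacement(vertices, normals, displacement):
    """Fine stage: move each vertex along its unit normal by a scalar offset."""
    return vertices + displacement[:, None] * normals


# Random stand-ins for learned or precomputed quantities.
embedding = rng.normal(size=EMBED_DIM)
w1 = rng.normal(size=(EMBED_DIM, 64)) * 0.01
w2 = rng.normal(size=(64, N_SHAPE)) * 0.01
template = rng.normal(size=(N_VERTS, 3))
shape_basis = rng.normal(size=(N_VERTS * 3, N_SHAPE)) * 0.01
normals = rng.normal(size=(N_VERTS, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
displacement = rng.normal(size=N_VERTS) * 0.001  # stand-in for UV map samples

# Coarse stage: embedding -> coefficients -> coarse mesh.
coeffs = mapping_network(embedding, w1, w2)
coarse = linear_shape_model(template, shape_basis, coeffs)

# Fine stage: add high-frequency detail on top of the coarse mesh.
fine = apply_displacement(coarse, normals, displacement)
```

Because the normals are unit length, each vertex of `fine` lies exactly `|displacement|` away from its counterpart in `coarse`, which is the sense in which the second stage adds detail without altering the coarse shape parameters.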