Lightweight Human Body and Hand Mesh Reconstruction

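3D human mesh reconstruction is widely used in downstream tasks such as film production and virtual reality. However, existing reconstruction methods focus on higher reconstruction accuracy and richer texture representation, and therefore depend on high-performance computing or capture equipment; low-cost, lightweight reconstruction has received little study. To reduce the usage cost and hardware requirements of human body reconstruction, this paper proposes a lightweight method for human body and hand mesh reconstruction. Based on a parametric model, it decouples the body and hand reconstruction tasks and designs a separate branch network for each according to their different characteristics. Both the body and the hand reconstruction branches use an encoder-decoder structure. The body branch has a two-stage encoder: the first stage obtains heatmaps and edge maps with Litehrnet and the Canny operator and uses them as a proxy representation of the image, and the second stage extracts global features with Shufflenet. Its decoder regresses the human body parameters as probability distributions through cascaded low-dimensional multilayer perceptrons. The encoder of the hand branch uses Litehrnet as its backbone to obtain multi-resolution feature branches and fuses them through pose pooling into a global feature. Its decoder obtains the hand vertices with a depthwise separable convolution network, estimates the shape with an MLP, and solves for the joint rotation parameters from the vertex coordinates using inverse-topology mathematics. Compared with existing methods, the parameter count and computational cost are significantly reduced: the whole model has 6.12M parameters and a computational cost of 433M, while still reconstructing well. The mean per-joint position error (MPJPE) on the Human3.6M dataset is 86.7 mm, and the Procrustes-aligned error (PA-MPJPE) of the hand branch on the FreiHand dataset is 10.8 mm. The method was also run on a mobile device, where inference takes 79.7 ms (12.5 fps) on a Snapdragon 8Gen3 processor, achieving real-time inference.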
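The MPJPE figure quoted above is a mean of per-joint Euclidean distances. A minimal sketch of that metric in plain Python follows, assuming joints are given as (x, y, z) tuples in millimetres; the function name `mpjpe` and the toy coordinates are hypothetical and are not taken from the paper. PA-MPJPE additionally aligns the prediction to the ground truth with a Procrustes (similarity) transform before measuring, which is not shown here.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error: the average Euclidean distance
    between corresponding predicted and ground-truth 3D joints.
    The result is in the same unit as the inputs."""
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Hypothetical toy joints in millimetres (illustration only, not paper data).
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0), (0.0, 200.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0), (0.0, 200.0, 0.0)]

# Per-joint distances are 5, 12 and 0 mm, so the mean is 17 / 3 mm.
error_mm = mpjpe(pred, gt)
print(f"MPJPE = {error_mm:.2f} mm")
```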
A Lightweight Method for Human Body and Hand Mesh Reconstruction
3D human body reconstruction shows substantial potential across various domains, including film and television production and virtual reality. Prevailing reconstruction methods predominantly emphasize reconstruction accuracy and texture detail, often requiring high-performance computing or sophisticated acquisition equipment, whereas cost-effective, lightweight reconstruction techniques have received little investigation. To reduce the usage cost and hardware requirements of human body reconstruction, this study proposes a strategy that disentangles the body and hand components based on a parameterized human body model. Distinct reconstruction networks are tailored to the different movement characteristics of the body and hands, balancing computational cost against performance. Both the body and hand reconstruction modules adopt an encoder-decoder architecture. The encoder of the body reconstruction module has a dual-stage design. Because it is difficult to extract adequate features directly from RGB images with lightweight backbone networks, the first stage uses Litehrnet and the Canny edge algorithm to derive heatmaps and edge maps, which serve as surrogate representations of the RGB images; preliminary features are then obtained through downsampling and concatenation. In the second stage, global features are extracted by Shufflenet. To improve performance, the activation function has been modified. To reduce the parameter count while preserving reconstruction quality, low-dimensional MLPs estimate parameters based on probability distributions: shape parameters are derived by a single MLP based on the Gaussian distribution, and pose parameters are estimated sequentially for each joint by cascaded low-dimensional MLPs guided by the matrix Fisher distribution. For the hand reconstruction branch, reconstruction is based on vertex regression, and parameters are obtained from the hand vertices. The encoder of the hand reconstruction branch employs Litehrnet to yield multi-resolution feature branches. High-resolution, shallow features express fine detail better, whereas low-resolution features offer superior global perception; to combine these strengths, we employ interpolation for pose pooling and fuse the high- and low-resolution features. The decoder then employs a DSConv and upsampling network to derive hand vertices. Shape parameters are estimated by an MLP from the hand vertices, and joint rotation parameters are derived from the vertex coordinates using inverse topology mathematics. Compared with existing methods, the proposed method notably reduces parameter and computational requirements, with an overall parameter count of 6.12M and a computational load of 433M. Evaluation on the Human3.6M dataset shows an MPJPE of 86.7 mm for the body reconstruction branch, outperforming the classical method HMR (88.0 mm) with only 11.6% of HMR's parameters. Moreover, the reconstructed-mesh PA-MPJPE of 10.8 mm for the hand reconstruction branch surpasses regression-based full-body reconstruction methods such as ExPose and PIXIE, with 4.7% and 3.1% of their parameter counts, respectively. Furthermore, the model was deployed on mobile devices using Android Studio and PyTorch Android, yielding an inference time of 79.7 ms (12.5 fps) on Snapdragon 8Gen3 and thereby meeting the requirements of real-time inference applications.
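The parameter savings from a depthwise separable convolution (DSConv), as used in the hand decoder, can be illustrated by counting weights. The sketch below compares a standard convolution with a depthwise-plus-pointwise pair; the layer sizes are hypothetical and are not the paper's actual configuration, and bias terms are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def dsconv_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a
    1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes chosen only for illustration.
c_in, c_out, k = 64, 128, 3

standard = conv_params(c_in, c_out, k)      # 9 * 64 * 128 = 73728
separable = dsconv_params(c_in, c_out, k)   # 576 + 8192 = 8768
ratio = separable / standard

print(f"standard: {standard}, separable: {separable}, ratio: {ratio:.3f}")
```

The ratio equals 1/c_out + 1/k², so for a 3 x 3 kernel it approaches 1/9 as the number of output channels grows.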

human body reconstruction; lightweight network; SMPL+H; MANO

An Ping (安平), Liu Yiyao (刘熠尧), Zhou Min (周敏), Huang Xinpeng (黄新彭), Yang Chao (杨超)

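School of Communication and Information Engineering, Shanghai University; Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai 200444, China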

human body reconstruction; lightweight network; SMPL+H model; MANO model

National Natural Science Foundation of China (62020106011, 62071287, 62371278, 62371279); Science and Technology Commission of Shanghai Municipality (20DZ2290100)

2024

Journal of Signal Processing
Chinese Institute of Electronics

CSTPCD; Peking University Core Journal
Impact factor: 1.502
ISSN: 1003-0530
Year, volume (issue): 2024, 40(7)