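Accurate human body image parsing performs pixel-level classification of an input human body image and is a fine-grained image semantic segmentation task. Because of its broad range of applications, it has attracted researchers' attention over the past 10 years, and the related techniques have developed rapidly. This paper focuses on how well existing accurate human body image parsing models predict the semantic boundaries of human body images. First, existing human body image datasets are summarized and compared in terms of scale and annotation categories. Second, according to differences in their underlying principles, existing human parsing methods are reviewed and classified into four groups: general image semantic segmentation, auxiliary information guidance, high-resolution feature enhancement, and label denoising. Third, to address the insufficient sensitivity of existing evaluation metrics to prediction ability in semantic boundary regions, a new evaluation metric, mean Boundary Intersection over Union (mBIoU), is constructed and used to evaluate existing models, comparing the methods' performance numerically. Finally, future directions for human parsing are discussed. The results show that, compared with existing metrics, mBIoU better distinguishes differences in models' prediction performance in semantic boundary regions, serves well for evaluating how capable accurate human body image parsing models are of handling the challenges specific to human parsing, and benefits future algorithm development and performance evaluation.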
Evaluation and analysis of accurate human body image parsing methods in semantic boundary regions
Accurate human body image parsing is a fine-grained image segmentation task aimed at pixel-level classification of human body images. Due to its wide range of applications, accurate human body image parsing has garnered significant attention from researchers in the past decade, leading to rapid advancements in related technologies. This paper focuses on evaluating the predictive performance of existing human body image parsing models in semantic boundary regions. Firstly, we provide a comprehensive overview of existing human parsing datasets, comparing their scale and annotation categories. Secondly, we classify existing methods based on their theoretical foundations, encompassing general semantic segmentation approaches, methods guided by auxiliary information, techniques for enhancing high-resolution features, and label denoising methodologies. Thirdly, addressing the limitations of current evaluation metrics in assessing predictive performance in boundary regions, we introduce a new evaluation metric, namely mean Boundary Intersection over Union (mBIoU), and employ it to evaluate existing models, quantitatively comparing their performance differences. Finally, we offer insights into future research directions. Our findings indicate that mBIoU, compared to existing metrics, better distinguishes models' predictive performance in semantic boundary regions. The proposed mBIoU effectively evaluates the capability of accurate human body image parsing models to address the specific challenges inherent in human parsing tasks, thereby facilitating algorithm development and performance evaluation.
computer vision; image semantic segmentation; accurate human body image parsing; semantic boundary performance
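
The abstract names mBIoU but does not give its formula, so the following is only a minimal sketch of one plausible reading: for each class, IoU is computed over a thin band of pixels along that class's contour, and the per-class scores are averaged. The function names `boundary_pixels` and `mean_boundary_iou`, the `width` parameter, the 4-connected erosion, and the choice to treat the image edge as a boundary are all assumptions made for illustration, not the paper's definition.

```python
from typing import List, Set, Tuple

Grid = List[List[int]]   # label map: one class index per pixel
Pixel = Tuple[int, int]  # (row, column)


def boundary_pixels(labels: Grid, cls: int, width: int = 1) -> Set[Pixel]:
    """Pixels of class `cls` within `width` 4-connected steps of the class contour.

    Assumption: neighbours outside the image count as "not this class",
    so class pixels on the image edge are treated as boundary pixels.
    """
    height, cols = len(labels), len(labels[0])
    mask = {(r, c) for r in range(height) for c in range(cols) if labels[r][c] == cls}
    interior = set(mask)
    for _ in range(width):
        # One erosion step: keep a pixel only if all four neighbours survive.
        interior = {
            (r, c)
            for (r, c) in interior
            if all(n in interior for n in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)))
        }
    return mask - interior


def mean_boundary_iou(pred: Grid, gt: Grid, num_classes: int, width: int = 1) -> float:
    """Average, over classes present in either map, of IoU restricted to boundary bands."""
    scores = []
    for cls in range(num_classes):
        p = boundary_pixels(pred, cls, width)
        g = boundary_pixels(gt, cls, width)
        union = p | g
        if not union:
            continue  # class absent from both prediction and ground truth
        scores.append(len(p & g) / len(union))
    return sum(scores) / len(scores) if scores else float("nan")


if __name__ == "__main__":
    gt = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    # Hypothetical prediction that overshoots the class-1 region by one column.
    pred = [
        [0, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 0, 0],
    ]
    print(mean_boundary_iou(pred, gt, num_classes=2))
```

Because only contour-adjacent pixels enter the score, an error of the same size costs more for a large region here than it would under region-level mean IoU, where correctly predicted interior pixels dilute it; on a toy image this small nearly every pixel lies in the band, so the example shows the mechanics rather than that effect. This matches the sensitivity the abstract attributes to mBIoU, though the band width and averaging scheme in the paper may differ from this sketch.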