High-Resolution Superpixel Generation Algorithm Based on Decoupled Patch Calibration

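Chinese abstract (translated): Superpixel segmentation is an important task in computer vision that groups pixels with similar attributes into clusters called superpixels. Superpixels not only aid image annotation but also underpin a range of downstream applications such as segmentation, optical flow estimation, and depth estimation. Although superpixel segmentation has advanced considerably, particularly with the advent of deep learning methods, existing solutions cannot handle high-resolution images effectively because of limits on GPU memory and computational power. To address this problem, the authors propose a new deep learning framework, the decoupled patch calibration high-resolution superpixel network (Patch Calibration Network, PCNet), which overcomes the limitations of existing methods by adopting a decoupled consistency learning strategy. The approach generates high-resolution superpixel results efficiently by predicting high-resolution outputs from low-resolution inputs, thereby sidestepping GPU memory limits. A key contribution of PCNet is the Decoupled Patch Calibration (DPC) branch, which takes high-resolution image patches as additional input in order to preserve detail and improve the assignment of boundary pixels. To improve the identification of boundary pixels, the authors design a dynamic guidance training mechanism based on binary masks. The mechanism encourages the network to focus on the main boundary within a region, reducing the task from multi-class classification to binary classification. This strategy both reduces the complexity of network optimization and markedly improves the accuracy of boundary detection. Extensive experiments on diverse datasets, including Mapillary Vistas, BIG, and the newly created Face-Human dataset, demonstrate the effectiveness of PCNet. The results show that PCNet can process 5K-resolution images and outperforms the existing state-of-the-art SCN method, which struggles with high-resolution inputs. The authors' contributions are the development of PCNet, a deep learning solution for high-resolution superpixel segmentation; the introduction of a decoupled patch calibration architecture; and the construction of an ultra-high-resolution benchmark for evaluating superpixel segmentation algorithms in high-resolution scenarios. The paper first reviews related work on superpixel segmentation, then presents the PCNet framework in detail, and then reports experimental results and comparisons with state-of-the-art methods. The conclusion summarizes the findings and outlines potential directions for future research. The availability of the code, pre-trained models, and the new benchmark dataset is expected to support further progress in high-resolution superpixel segmentation. In summary, the paper offers a notable advance in superpixel segmentation by providing a solution that handles high-resolution images efficiently and accurately. With its DPC branch and dynamic guidance training mechanism, the proposed PCNet framework offers a promising direction for future research and applications in computer vision. The code, pre-trained models, and newly constructed evaluation benchmark are available at https://github.com/wangyxxjtu/PCNet.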

English title: Generating Superpixels for High-Resolution Images with Decoupled Patch Calibration

English abstract: Superpixel segmentation is a significant task in the field of computer vision that involves grouping pixels with similar attributes into coherent clusters known as superpixels. These superpixels are not only useful for image annotation but also serve as a foundation for various downstream applications such as segmentation, optical flow estimation, and depth estimation. Despite the substantial progress in superpixel segmentation techniques, particularly with the advent of deep learning methods, existing solutions have been unable to effectively handle high-resolution images due to constraints in GPU memory and computational power. The authors propose the Patch Calibration Network (PCNet), a novel deep learning framework that addresses the limitations of current methods by employing a decoupled consistency learning strategy. This approach allows for the efficient generation of high-resolution superpixels by predicting high-resolution outputs from low-resolution inputs, thereby bypassing the GPU memory limitations. A key aspect of PCNet is the Decoupled Patch Calibration (DPC) branch, which incorporates high-resolution image patches as additional inputs to preserve fine details and enhance boundary pixel allocation. To improve the identification of boundary pixels, the authors introduce a dynamic guidance training mechanism that utilizes a binary mask. This mechanism encourages the network to focus on the primary boundaries within a region, simplifying the task from multi-class classification to a binary classification problem. This innovative strategy not only reduces the complexity of network optimization but also significantly enhances the precision of boundary detection. The paper demonstrates the effectiveness of PCNet through extensive experiments on diverse datasets, including Mapillary Vistas, BIG, and a newly created Face-Human dataset. The results indicate that PCNet can successfully process 5K resolution images and achieve superior performance compared to the state-of-the-art SCN method, which struggles with high-resolution inputs. The authors' contributions include the development of PCNet, a deep learning solution for high-resolution superpixel segmentation, the introduction of a decoupled regional calibration architecture, and the construction of an ultra-high-resolution benchmark dataset for evaluating the performance of superpixel segmentation algorithms in high-resolution scenarios. The paper is structured to first review the related work in the field of superpixel segmentation, then present the PCNet framework in detail, followed by experimental results and comparisons with state-of-the-art methods. The conclusion summarizes the findings and outlines potential directions for future research. The availability of code, pre-trained models, and the new benchmark dataset will undoubtedly facilitate further advancements in the field of high-resolution superpixel segmentation. In summary, this paper presents a significant advancement in the domain of superpixel segmentation, providing a solution that can handle high-resolution images efficiently and accurately. The proposed PCNet framework, with its innovative DPC branch and dynamic guidance training mechanism, offers a promising direction for future research and applications in computer vision. Our code, pre-trained models, and the newly constructed evaluation benchmark dataset are available at https://github.com/wangyxxjtu/PCNet.
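
The abstract says that a binary mask reduces boundary learning within a region from multi-class to binary classification, but it does not say how the mask is built. The sketch below is only one plausible reading and is not taken from the PCNet code: it assumes the mask marks the most frequent label in a patch, and the helper names `dominant_binary_mask` and `boundary_pixels` are hypothetical.

```python
from collections import Counter


def dominant_binary_mask(label_patch):
    """Collapse a multi-class label patch into a binary mask.

    Assumption (not from the paper): pixels carrying the most frequent
    label in the patch become 1 and every other pixel becomes 0.
    """
    counts = Counter(label for row in label_patch for label in row)
    dominant_label, _ = counts.most_common(1)[0]
    return [[1 if label == dominant_label else 0 for label in row]
            for row in label_patch]


def boundary_pixels(mask):
    """Return (row, col) positions whose right or lower neighbour differs."""
    height, width = len(mask), len(mask[0])
    boundary = []
    for r in range(height):
        for c in range(width):
            right_differs = c + 1 < width and mask[r][c] != mask[r][c + 1]
            down_differs = r + 1 < height and mask[r][c] != mask[r + 1][c]
            if right_differs or down_differs:
                boundary.append((r, c))
    return boundary


# Toy 3x4 patch with three classes (3, 7 and 5); class 3 is the most frequent.
patch = [
    [3, 3, 3, 7],
    [3, 3, 7, 7],
    [3, 5, 7, 7],
]
mask = dominant_binary_mask(patch)
edges = boundary_pixels(mask)
```

In this toy patch the three-class problem becomes a single in-or-out decision per pixel, and only the boundary of the dominant region survives in `edges`. The boundary between classes 5 and 7 disappears, which is the sense in which the task is simplified.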

English keywords:

superpixel segmentation; image segmentation; high-resolution vision; deep learning; artificial intelligence

Authors:

Wang Yaxiong (王亚雄), Wei Yunchao (魏云超), Qian Xueming (钱学明), Zhu Li (朱利)

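Affiliations:

School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230009

School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044

Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049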

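Chinese keywords:

superpixel segmentation (超像素分割); image segmentation (图像分割); high-resolution vision (高分辨率视觉); deep learning (深度学习); artificial intelligence (人工智能)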

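Funding:

National Natural Science Foundation of China; Fundamental Research Funds for the Central Universities (Hefei University of Technology Academic Newcomer Promotion Program)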

Grant numbers:

62302140; JZ2024HGTB0261

Publication year:

2024

DOI:

10.11897/SP.J.1016.2024.02664

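Journal: Chinese Journal of Computers (计算机学报)

Sponsors: China Computer Federation; Institute of Computing Technology, Chinese Academy of Sciences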

Indexed in: CSTPCD; Peking University Core Journals

Impact factor: 3.18

ISSN: 0254-4164

Year, volume (issue): 2024, 47(11)