A review on image semantic segmentation based on domain adaptation

With the rapid development of deep learning, semantic segmentation algorithms have improved significantly in performance, but they rely heavily on large-scale paired image datasets and on time-consuming, labor-intensive pixel-level annotation. Synthetic images, which are available at scale and easy to annotate, can replace real images in training and effectively reduce training costs. However, the domain gap between synthetic and real images degrades the generalization capability of segmentation networks. To address this gap, researchers have proposed Domain Adaptive Semantic Segmentation (DASS) algorithms, which extract knowledge shared across the synthetic and real domains (domain-invariant features), thereby reducing the domain gap and improving the generalization of the segmentation network on real images. This paper classifies mainstream DASS algorithms according to network structure, analyzes performance comparisons among the algorithms, and proposes future research directions. The results show that early DASS algorithms use generative adversarial networks to align the marginal distributions of the source and target domains, but their network structures are complex and they achieve only global alignment of the two domains, not fine-grained alignment between categories, so their performance is comparatively low. Subsequent algorithms gradually shift to self-training network structures, in which a pre-trained segmentation network generates pseudo-labels on the target domain to supervise the next round of training; these structures are simpler and outperform the earlier algorithms. With the advent of Transformer networks, their strong feature extraction capability has further improved the accuracy of DASS algorithms.
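The pseudo-labeling step used by self-training methods can be illustrated with a minimal NumPy sketch. It assumes one common variant, confidence thresholding, which the abstract itself does not specify; the names `make_pseudo_labels` and `IGNORE_INDEX` and the threshold value are hypothetical choices for illustration, not taken from the reviewed algorithms.

```python
import numpy as np

# Assumption: pixels marked with this value are skipped by the training loss.
IGNORE_INDEX = 255


def softmax(logits, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def make_pseudo_labels(logits, threshold=0.9):
    """Turn target-domain logits of shape (C, H, W) into an (H, W) label map.

    Each pixel takes the most probable class; pixels whose top probability
    is below `threshold` are set to IGNORE_INDEX (an assumed filtering rule).
    """
    probs = softmax(logits, axis=0)
    confidence = probs.max(axis=0)   # top class probability per pixel
    labels = probs.argmax(axis=0)    # predicted class per pixel
    labels = np.where(confidence >= threshold, labels, IGNORE_INDEX)
    return labels.astype(np.int64)


# Toy input: 2 classes on a 2x2 image, standing in for the output of a
# segmentation network pre-trained on the source domain.
logits = np.array([
    [[4.0, 0.1], [0.0, 0.2]],   # class 0 scores
    [[0.0, 0.0], [5.0, 0.1]],   # class 1 scores
])
pseudo = make_pseudo_labels(logits, threshold=0.9)
print(pseudo)
```

In this toy case the two confident pixels keep their predicted classes and the two ambiguous pixels are masked out, so only the confident ones would supervise the next training round.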

image semantic segmentation; deep learning; domain adaptive semantic segmentation; generative adversarial network; self-training network

Liu Meiqin (刘美琴), Wang Zilin (王子麟)

School of Computer Science and Technology, Beijing Jiaotong University, Beijing 100044, China

National Natural Science Foundation of China (62372036)

2024

Journal of Beijing Jiaotong University
Beijing Jiaotong University

CSTPCD; Peking University Core Journals (北大核心)
Impact factor: 0.525
ISSN: 1673-0291
Year, volume (issue): 2024, 48(2)