MSMVT: Semi-Supervised Framework with Multi-Scale and Multi-View Transformer for Medical Image Segmentation

LI Feixiang¹, JIANG Ailian¹

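Author information

1. College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Jinzhong, Shanxi 030600, China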

Abstract

In recent years, the Transformer has made remarkable progress across many supervised computer vision tasks, yet its performance in semi-supervised image segmentation remains limited by the scarcity of high-quality annotated medical images. To address this, this paper proposes MSMVT (multi-scale and multi-view Transformer), a semi-supervised medical image segmentation framework. Given the promising results of contrastive learning in Transformer pre-training, the paper designs a pseudo-label-guided multi-scale prototype contrastive learning module. The module uses image-pyramid data augmentation to generate semantically rich multi-scale prototype representations for unlabeled images. Contrastive learning then reinforces the consistency between prototypes at different scales, which mitigates the under-training of the Transformer caused by label scarcity. Furthermore, to stabilize Transformer training, the paper proposes a multi-view consistency learning strategy in which a weakly perturbed view is used to correct multiple strongly perturbed views. By minimizing the output discrepancies between views, the model maintains multi-level consistency under different perturbations. Experimental results show that, with an annotation ratio of only 10%, MSMVT achieves DSC scores of 88.93%, 84.75%, and 85.38% on the three public datasets ACDC, LIDC, and ISIC, respectively, outperforming existing semi-supervised medical image segmentation methods.
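The results above are reported as DSC (Dice similarity coefficient) values. As a reminder of what that metric measures, the following is a minimal sketch of the standard definition, DSC = 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_coefficient`, the flat 0/1 list representation, and the toy masks are illustrative assumptions and are not taken from the paper, which may compute DSC per class or per volume.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labeled foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Total foreground pixels across the two masks.
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy 2x3 masks flattened row by row (illustrative values only).
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(predicted, reference)
print(f"DSC = {score:.4f}")
```

Here the two masks share 2 foreground pixels and each contains 3, so the score is 2·2 / (3 + 3) ≈ 0.667. A reported value such as 88.93% corresponds to this ratio averaged over a test set and expressed as a percentage.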

Key words

semi-supervised medical image segmentation/pseudo-labeling/Transformer/multi-scale/multi-view

Publication year

2025

Computer Engineering and Applications (计算机工程与应用)

North China Institute of Computing Technology (华北计算技术研究所)

Indexed in: CSCD, Peking University Core Journals

Impact factor: 0.683

ISSN: 1002-8331
