
Dual-branch weakly supervised semantic segmentation based on activation modulation

Semantic segmentation with image-level annotation has been widely studied because the labels are cheap to produce and the resulting performance is satisfactory. To address the sparse activation regions of class activation maps and the semantic ambiguity between foreground and background, a dual-branch weakly supervised semantic segmentation network based on activation modulation is proposed. The network uses ResNet50 and Vision Transformer as its two feature-extraction branches, and an activation modulation module is embedded in the convolutional branch. This module forces the model to activate pixels with intermediate scores, producing compact class activation maps and thereby alleviating the sparsity of their activation regions. In addition, a dynamic threshold adjustment strategy based on cosine annealing decay is proposed. It adaptively determines the upper background threshold during training so that more low-confidence foreground pixels take part in segmentation training, yielding complete and accurate segmentation maps. The network is evaluated on the PASCAL VOC 2012 and MS COCO 2014 datasets. It reaches mIoU values of 74.2% and 74.0% on the PASCAL VOC 2012 validation and test sets, respectively, and 45.9% on the MS COCO 2014 validation set. The experimental results show that the network can resolve mis-segmentation in scenes where foreground and background have similar colours and achieves excellent segmentation performance.
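The abstract names a dynamic threshold based on cosine annealing decay but does not give its formula. The following is only a minimal sketch of the standard cosine annealing schedule applied to a background threshold, not the authors' implementation. The function names, the `start` and `end` values, and the rule that a pixel scoring at or above the threshold counts as foreground are all assumptions made for illustration.

```python
import math


def cosine_annealed_threshold(step, total_steps, start=0.45, end=0.25):
    """Decay a threshold from `start` to `end` along a half cosine.

    `start` and `end` are hypothetical values, not taken from the paper.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp training progress to [0, 1] so out-of-range steps stay bounded.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return end + 0.5 * (start - end) * (1.0 + math.cos(math.pi * progress))


def foreground_mask(scores, bg_threshold):
    """Assumed rule: a pixel is foreground if its score reaches the threshold."""
    return [score >= bg_threshold for score in scores]


if __name__ == "__main__":
    scores = [0.1, 0.3, 0.5]  # toy activation scores for three pixels
    for step in (0, 5, 10):
        threshold = cosine_annealed_threshold(step, total_steps=10)
        print(step, round(threshold, 3), foreground_mask(scores, threshold))
```

Under these assumptions the threshold falls as training progresses, so pixels with lower scores are gradually admitted as foreground, which matches the stated goal of bringing more low-confidence foreground pixels into segmentation training.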

Keywords: weakly supervised learning; semantic segmentation; class activation map; activation modulation; dynamic threshold

Wang Jiali (王家莉), Tan Mian (谭棉), Feng Fujian (冯夫健)


School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025

Guizhou Provincial Key Laboratory of Pattern Recognition and Artificial Intelligence Systems, Guiyang 550025


2024

Electronic Measurement Technology (电子测量技术)
Beijing Institute of Radio Technology


Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 1.166
ISSN:1002-7300
Year, volume (issue): 2024, 47(24)