An Improved Semantic Segmentation Method Based on Deeply Supervised Latent Space Construction

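Abstract (translated from the Chinese): Existing convolution operations struggle to capture relationships between distant regions in semantic segmentation, which leads to segmentation results that contradict human common sense. To address this, an improved semantic segmentation method based on deeply supervised latent space construction is proposed. The method follows a "feature map-latent space-feature map" pipeline: pixel features in image space are converted into node features in the latent space, and the positional and semantic relationships between regions are converted into connection weights between nodes, which realizes the feature conversion from the feature map to the latent space. During latent space construction, a Kullback-Leibler divergence loss supervises the projection matrix so that features are not lost when the feature map is converted to latent space nodes, and an InfoNCE loss supervises the node feature representations against the ground-truth label representations so that image features stay consistent with the labels. A graph neural network then performs semantic reasoning on the constructed latent space and learns the relationships between nodes, giving the model the ability to learn semantic relationships between regions and thereby reducing anti-common-sense phenomena in segmentation results. Experiments on the public CityScapes dataset show that the method reaches a mean Intersection over Union (mIoU) of 81.1%, which is 2.6 percentage points higher than the baseline segmentation network, and effectively improves segmentation results.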
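The "feature map-latent space-feature map" pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration, not the authors' implementation: the function name `latent_round_trip`, the softmax-based soft assignment of pixels to nodes, the single linear propagation step standing in for the graph neural network, and the residual connection are all hypothetical choices, since the abstract does not specify them.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def latent_round_trip(pixels, proj_logits, adjacency):
    """Feature map -> latent nodes -> graph step -> feature map.

    pixels:      N x C pixel features (a flattened feature map).
    proj_logits: K x N scores relating each of K latent nodes to each pixel
                 (hypothetical stand-in for the projection matrix).
    adjacency:   K x K connection weights between nodes (hypothetical).
    """
    # Soft assignment: each pixel spreads a total weight of 1 over the K nodes.
    assign = transpose([softmax(col) for col in transpose(proj_logits)])  # K x N
    # Feature map -> latent space: every node aggregates its assigned pixels.
    nodes = matmul(assign, pixels)  # K x C
    # One linear propagation step over the node graph (stand-in for a GNN layer).
    reasoned = matmul(adjacency, nodes)  # K x C
    # Latent space -> feature map: scatter node features back to the pixels.
    back = matmul(transpose(assign), reasoned)  # N x C
    # Residual connection (an assumption) keeps the original pixel features.
    return [[p + b for p, b in zip(prow, brow)] for prow, brow in zip(pixels, back)]
```

With an all-zero `adjacency` the function returns the input features unchanged, which makes it easy to check that the graph step is the only part that injects cross-region information.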

English title: Semantic Segmentation Improvement Method Based on Deep Supervision for the Construction of Latent Space

English abstract: Existing convolution operations cannot effectively capture the relationships between long-distance regions in semantic segmentation tasks, resulting in segmentation results that do not conform to human common sense. Accordingly, a semantic segmentation improvement method based on deeply supervised latent space construction is proposed. The method adopts a "feature map-latent space-feature map" process to convert pixel features in image space into node features in a latent space, and to convert the positional and semantic relationships between regions into connection weights between nodes, thereby achieving feature conversion from the feature map to the latent space. During construction of the latent space, a Kullback-Leibler divergence loss function supervises the projection matrix to avoid losing features during the transformation from feature maps to latent space nodes. An Information Noise Contrastive Estimation (InfoNCE) loss function supervises the node feature representations and the ground-truth label representations, ensuring consistency between image features and labels. The proposed method uses a Graph Neural Network (GNN) for semantic inference on the constructed latent space, learning the relationships between nodes and endowing the model with the ability to learn semantic relationships between regions, thereby reducing the anti-common-sense phenomenon in segmentation results. Experimental results on the publicly available CityScapes dataset show that the proposed method achieves a mean Intersection over Union (mIoU) of 81.1%, which is 2.6 percentage points higher than that of the baseline segmentation network, and can effectively improve segmentation results.
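The two supervision signals named in the abstract have standard textbook definitions, sketched below in plain Python. The abstract does not say which target distribution the projection matrix is compared against, nor how positive and negative pairs are formed for InfoNCE, so the function names, the cosine similarity, and the `temperature` default here are assumptions for illustration only.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(query, keys, positive_index, temperature=0.1):
    """InfoNCE loss: negative log-softmax score of the positive key.

    query:          one representation vector (e.g. a node feature).
    keys:           candidate vectors, one positive and the rest negatives
                    (e.g. label representations; the pairing is an assumption).
    positive_index: position of the positive key in `keys`.
    """
    logits = [cosine(query, k) / temperature for k in keys]
    m = max(logits)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[positive_index]
```

Both losses are non-negative, and `info_nce` shrinks toward zero as the query aligns with its positive key and away from the negatives, which is the sense in which it keeps node features consistent with label representations.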

English keywords:

semantic segmentation; Convolutional Neural Network (CNN); deep supervision; Graph Neural Network (GNN); anti-common sense phenomenon

Authors:

王柏涵、姜晓燕、范柳伊

Affiliation:

School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201600

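Keywords (translated from the Chinese):

semantic segmentation; convolutional neural network; deep supervision; graph neural network; anti-common-sense phenomenon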

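Funding:

National Natural Science Foundation of China, Joint Fund Project; National Natural Science Foundation of China, Key Program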

Project numbers:

U2033218; 61831018

Publication year:

2024

DOI:

10.19678/j.issn.1000-3428.0067369

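Journal: Computer Engineering (计算机工程)

Sponsors: East China Institute of Computing Technology (华东计算技术研究所); Shanghai Computer Society (上海市计算机学会)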

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.581

ISSN: 1000-3428

Year, volume (issue): 2024, 50(3)

Number of references: 29