Semantic Segmentation Method for Street-Scene Images Fusing Multi-Dimensional Features

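Abstract (translated from the Chinese): To further improve the accuracy of deep-learning semantic segmentation on complex street-scene images, this paper proposes MDFNet, a street-scene semantic segmentation network built on PointRend that fuses multi-dimensional features (MDF). First, a target area enhancement module is constructed to optimize the feature extraction sub-network; it adaptively refines the intermediate feature maps in every convolutional block of the deep network, strengthening the fine-grained extraction of multi-dimensional feature information from complex street-scene images. Next, a feature pyramid grid is introduced at the feature fusion stage; it applies different convolution kernels to street-scene images at different scales, so that features of the various target classes are captured more comprehensively at different resolutions. Finally, a dual decoder head restores image detail more finely and yields the per-pixel classification result. Experiments show that, compared with other strong segmentation networks such as DeepLabV3 and SegFormer, the proposed network achieves higher segmentation accuracy on the Cityscapes complex street-scene dataset: the mean intersection over union reaches 80.11%, an improvement of more than 3.51% over the other networks, indicating a stronger understanding of complex street-scene images.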

English title: Semantic segmentation method for street images with multi-dimensional features

English abstract: To further enhance the segmentation accuracy of deep-learning semantic segmentation methods on complex street images, this paper proposes MDFNet, a semantic segmentation network for street images that incorporates multi-dimensional features and is based on the PointRend network. First, the algorithm builds a target area enhancement module to optimize the feature extraction sub-network; the module adaptively refines the intermediate feature map in each convolutional block of the deep network and thereby enhances the fine extraction of multi-dimensional feature information from complex street images. Second, the paper introduces a feature pyramid grid during feature fusion. This module uses different convolutional kernels to process street images at different scales and thus obtains, more comprehensively, features of the various targets in complex street images at different resolutions. Finally, a dual decoder head recovers image details more finely to produce the pixel-by-pixel classification results. The experimental results show that the proposed network achieves higher segmentation accuracy on the Cityscapes dataset than other excellent networks such as DeepLabV3 and SegFormer. The mean intersection over union (mIoU) reaches 80.11%, an improvement of more than 3.51% over the other networks. The method provides a better understanding of images of complex street scenes.
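
The abstract reports accuracy as mean intersection over union (mIoU). As a reminder of what that figure measures, here is a minimal pure-Python sketch that computes mIoU from a confusion matrix. It is not the authors' evaluation code. The function names, the toy labels, and the use of 255 as an "ignore" label are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count (true class, predicted class) pairs over flattened pixel labels.

    ignore_index is an assumed convention for pixels excluded from scoring.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_index:
            continue  # skip pixels that carry no valid ground-truth class
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that occur at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # true c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, true other
        denom = tp + fp + fn
        if denom > 0:                               # ignore absent classes
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example: six pixels, three classes, last pixel ignored.
y_true = [0, 0, 1, 1, 2, 255]
y_pred = [0, 1, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(round(mean_iou(cm), 4))
```

In the toy example the per-class IoU values are 1/2, 2/3 and 1, so the mean is 13/18. For a dataset-level score, the confusion matrix would typically be accumulated over all test images before the per-class IoU values are averaged.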

English keywords:

semantic segmentation; target area enhancement; attention mechanism; feature pyramid grid; multi-dimensional features

Authors:

ZHU Lei (朱磊), CHE Chenjie (车晨洁), YAO Tongyu (姚同钰), PAN Yang (潘杨), ZHANG Bo (张博)

Affiliation:

School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China

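Keywords (translated from the Chinese):

semantic segmentation; target area enhancement; attention mechanism; feature pyramid grid; multi-dimensional features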

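Funding:

National Natural Science Foundation of China; Key Research and Development Program of Shaanxi Province; Natural Science Basic Research Program of Shaanxi Province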

Project numbers:

61971339; 2019GY-113; 2019JQ-361

Publication year:

2024

DOI:

10.37188/CJLCD.2023-0208

Chinese Journal of Liquid Crystals and Displays (液晶与显示)

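Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufacturers Association; Liquid Crystal Branch of the Chinese Physical Society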

Indexed in: CSTPCD; Peking University Core Journals (北大核心)

Impact factor: 0.964

ISSN: 1007-2780

Year, volume (issue): 2024, 39(7)

Number of references: 3