Semantic segmentation method for street images with multi-dimensional features

To further improve the accuracy of deep-learning semantic segmentation on complex street-scene images, this paper proposes MDFNet, a street-scene semantic segmentation network based on PointRend that fuses multi-dimensional features (MDF). First, a target area enhancement module is built to optimize the feature extraction sub-network; it adaptively refines the intermediate feature maps at each convolutional block of the deep network, strengthening the fine-grained extraction of multi-dimensional feature information from complex street-scene images. Second, a feature pyramid grid is introduced at the feature fusion stage, using different convolution kernels to process street-scene images at different scales so as to capture the features of the various target classes at different resolutions more comprehensively. Finally, dual decoder heads recover image details more finely and produce the pixel-wise classification result. Experiments show that, compared with other strong segmentation networks such as DeepLabV3 and SegFormer, the proposed network achieves higher segmentation accuracy on the Cityscapes complex street-scene dataset: its mean intersection over union (mIoU) reaches 80.11%, an improvement of more than 3.51% over the other networks, indicating a better understanding of complex street-scene images.
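The headline result in the abstract is a mean intersection over union (mIoU) score. As a reading aid, the following is a minimal sketch of how mIoU is commonly computed from a pixel-level confusion matrix. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` and the toy label lists are hypothetical, and the choice to skip classes absent from both prediction and ground truth is an assumption.

```python
from typing import List, Sequence


def confusion_matrix(truth: Sequence[int], pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN).

    Assumption: classes that appear in neither the ground truth nor the
    prediction are skipped rather than counted as zero.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(n)) - tp    # pixels wrongly assigned to c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (hypothetical data): 6 pixels, 3 classes, flattened label maps.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, 3)
miou = mean_iou(cm)
print(f"mIoU = {miou:.4f}")
```

In this toy case the per-class IoU values are 1/3, 2/3 and 1/2. On a real benchmark the confusion matrix would be accumulated over every pixel of every validation image before the per-class ratios are taken.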

semantic segmentation; target area enhancement; attention mechanism; feature pyramid grid; multi-dimensional features

ZHU Lei, CHE Chenjie, YAO Tongyu, PAN Yang, ZHANG Bo

School of Electronics and Information, Xi'an Polytechnic University, Xi'an 710600, Shaanxi, China

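National Natural Science Foundation of China (61971339); Key Research and Development Program of Shaanxi Province (2019GY-113); Natural Science Basic Research Program of Shaanxi Province (2019JQ-361)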

2024

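Chinese Journal of Liquid Crystals and Displays
Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences; Liquid Crystal Branch of the China Optics and Optoelectronics Manufactures Association; Liquid Crystal Branch of the Chinese Physical Society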

Indexed in: CSTPCD; Peking University Core Journals
Impact factor: 0.964
ISSN: 1007-2780
Year, volume (issue): 2024, 39(7)