Computer Science, 2024, Vol. 51, Issue 11: 174-181. DOI: 10.11896/jsjkx.231000009

High-precision Real-time Semantic Segmentation Algorithm Architecture for Autonomous Driving

Geng Huantong¹, Li Jiaxing², Jiang Jun², Liu Zhenyu², Fan Zichen³

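Author information

  • 1. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044; Key Open Laboratory of Radar Meteorology, China Meteorological Administration, Nanjing 210044; School of Information Engineering, Jiangsu Open University, Nanjing 210036
  • 2. School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044
  • 3. School of Software, Nanjing University of Information Science and Technology, Nanjing 210044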

Abstract

The proportional-integral-derivative (PID) semantic segmentation architecture mitigates overshoot in two-branch architectures, where fine-grained detail features are easily overwhelmed by surrounding contextual information, and it achieves excellent performance. However, the high-resolution boundary branch in this architecture significantly slows inference. To address this issue, an efficient PID architecture built on a spatial attention mechanism and a lightweight auxiliary semantic branch is proposed. A lightweight attention fusion module extracts precise contextual information and guides the fusion of different feature information, and a fast aggregation pyramid pooling module rapidly aggregates semantic information across multiple scales. In addition, a deep supervision training strategy combined with the Canny edge detection operator is designed to improve training. Compared with the baseline, the proposed model gains 6% in accuracy at the cost of a small increase in latency. It achieves a good balance between accuracy and speed on the Cityscapes, CamVid, and KITTI datasets, with accuracy exceeding that of existing models in the same speed range. Notably, the model reaches 78.5% accuracy at 120.9 frames/s on the Cityscapes test set.
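The abstract says that a spatial attention mechanism guides the fusion of different feature information, but it does not specify how. The sketch below illustrates one common form of attention-guided fusion, assuming a per-pixel sigmoid weight that blends a detail feature map with a context feature map. The function names (`sigmoid`, `attention_fuse`), the inputs, and the blending rule are all hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Map a real-valued attention logit to a weight in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(detail, context, logits):
    """Blend two H x W feature maps pixel by pixel (illustrative assumption).

    For each pixel: weight = sigmoid(logit), and
    output = weight * detail + (1 - weight) * context.
    A high weight keeps the detail feature; a low weight keeps the context.
    """
    fused = []
    for d_row, c_row, l_row in zip(detail, context, logits):
        row = []
        for d, c, l in zip(d_row, c_row, l_row):
            w = sigmoid(l)
            row.append(w * d + (1.0 - w) * c)
        fused.append(row)
    return fused


# Toy 2 x 2 example: detail features are all 1, context features are all 0,
# so each fused value equals the attention weight at that pixel.
detail = [[1.0, 1.0], [1.0, 1.0]]
context = [[0.0, 0.0], [0.0, 0.0]]
logits = [[0.0, 20.0], [-20.0, 0.0]]
fused = attention_fuse(detail, context, logits)
```

In this toy case a logit of 0 gives an even blend of the two maps, while strongly positive or negative logits select the detail or the context feature almost exclusively. In a real network the logits would be predicted from the features themselves; the abstract does not describe that step.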

Key words

Real-time semantic segmentation/Autonomous driving/Overshoot/Spatial attention mechanism/Edge detection


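Funding

National Natural Science Foundation of China (42375145)

Key Open Laboratory of Radar Meteorology, China Meteorological Administration (2023LRM-A02)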

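Publication year

2024
Computer Science
Chongqing Southwest Information Co., Ltd. (formerly the Southwest Information Center of the Ministry of Science and Technology)

Indexed in: CSTPCD, CSCD, Peking University Core Journals
Impact factor: 0.944
ISSN: 1002-137X
Number of references: 35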