Computer Engineering and Applications, 2025, Vol. 61, Issue (1): 252-262. DOI: 10.3778/j.issn.1002-8331.2308-0315

DCaT: Lightweight Semantic Segmentation Model for High-Resolution Scenes

黄科迪 黄鹤鸣 李伟 樊永红

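Author information

  • 1. School of Computer Science, Qinghai Normal University, Xining 810008; State Key Laboratory of Tibetan Intelligent Information Processing and Application, Xining 810008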

Abstract

Semantic segmentation is a key task in computer vision for analyzing and understanding scenes. However, existing segmentation models incur high computational cost and memory demand, which makes them unsuitable for lightweight semantic segmentation of high-resolution scenes. To address this problem, a new lightweight semantic segmentation model for high-resolution scenes, DCaT, is proposed. First, depthwise separable convolution extracts the local low-level semantics of the image. Second, a lightweight Transformer based on coordinate-aware and dynamic sparse mixed attention captures the global high-level semantics. Then, a fusion module injects the high-level semantics into the low-level semantics. Finally, a segmentation head outputs the per-pixel predicted labels. Experimental results on the high-resolution Cityscapes dataset show that, compared with the baseline model, DCaT improves the mean intersection over union (mIoU) by 1.5 percentage points, reduces model complexity by 26%, and increases inference speed by 12%. DCaT thus achieves a better balance between model complexity and performance in high-resolution scenes, which demonstrates its effectiveness and practicality.
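The abstract reports accuracy as mean intersection over union, with gains stated in percentage points. As a rough illustration of that metric only, the following is a minimal sketch in plain Python. The function names `confusion_matrix` and `mean_iou` and the toy labels are assumptions made for this example and do not come from the paper. The sketch also assumes every label is a valid class index, so it does not handle the ignore labels that benchmark evaluation scripts typically exclude.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average the per-class IoU = TP / (TP + FP + FN) over observed classes."""
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                  # true class c, predicted otherwise
        fp = sum(row[c] for row in matrix) - tp   # predicted c, true class differs
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical flattened label maps for a 6-pixel image with 3 classes.
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
score = mean_iou(labels, preds, num_classes=3)
print(f"mIoU = {score:.3f}")
```

For the toy input the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. An improvement of 1.5 percentage points in this metric means the mean rises by 0.015 on this 0-to-1 scale.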

Key words

semantic segmentation / lightweight / high resolution / Transformer / sparse attention


Publication year

2025
Computer Engineering and Applications
North China Institute of Computing Technology

Indexed in: CSCD, Peking University Core Journals
Impact factor: 0.683
ISSN: 1002-8331