Tunnel Crack Segmentation Based on a Lightweight Transformer

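Chinese abstract (translated): Crack detection is essential to the structural safety of tunnels, and finding crack defects promptly helps reduce maintenance costs and keep traffic safe. However, conventional convolutional neural networks for tunnel crack detection mainly pursue higher detection accuracy, with algorithmic complexity rising alongside it, so balancing accuracy against real-time performance remains a difficulty in current research. To address this, the paper proposes CrackViT, a crack segmentation method based on a lightweight Transformer. First, MobileViT, a hybrid of a convolutional neural network and a Transformer, is used to build the crack feature extraction network; this reduces the model's parameter count and computation while extracting both the global information and the local features of crack images. Next, an improved atrous spatial pyramid pooling (ASPP) decoder extracts and fuses features at different scales and outputs a pixel-level probability distribution. Because crack images suffer from missing detail, an efficient channel attention (ECA) module is introduced to strengthen the extraction of crack features. In addition, an online hard example mining (OHEM) loss function is designed to mitigate the class imbalance between crack and background. Experiments on a single 3050Ti GPU show that CrackViT reaches an IoU of 75.62% on the crack dataset at 63 FPS with only 2.43 M parameters, while CrackViT-L reaches an IoU of 76.83% with 3.56 M parameters at 61 FPS. Test accuracy exceeds that of most mainstream models with fewer parameters. The results indicate that the crack edges in the segmentation maps predicted by CrackViT are clearer and more complete, that cracks are detected effectively without sacrificing inference speed, and that the algorithm is therefore useful for practical tunnel crack detection.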
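The accuracy figures in the abstract are reported as IoU (intersection over union). As a reminder of what that metric measures, here is a minimal sketch for a single pair of binary masks. The function name `binary_iou`, the toy masks, and the convention for two empty masks are illustrative assumptions; the abstract does not say how IoU was aggregated over the dataset.

```python
def binary_iou(pred, target):
    """IoU of the crack class for two same-sized 2D binary masks (1 = crack)."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1  # pixel is crack in both masks
            if p or t:
                union += 1         # pixel is crack in at least one mask
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


# Toy 3x4 masks (hypothetical data, not from the paper).
pred = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
target = [
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
iou = binary_iou(pred, target)  # 2 shared crack pixels out of 4 in the union
```

Because cracks occupy few pixels, a handful of misplaced edge pixels changes the crack-class IoU noticeably, which is why it is a stricter measure here than pixel accuracy.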

English title: Tunnel crack segmentation based on lightweight Transformer

English abstract: Crack detection is crucial to ensuring the safety of tunnel structures, and the timely detection of tunnel crack defects helps reduce maintenance costs and safeguard traffic. However, traditional convolutional neural networks for tunnel crack detection mainly focus on improving detection accuracy, with algorithm complexity increasing as a result, and balancing accuracy against real-time performance remains a difficulty in current research. To address this problem, this paper proposes CrackViT, a crack segmentation method based on a lightweight Transformer. First, the MobileViT network, a hybrid of a convolutional neural network and a Transformer, is used to construct the crack feature extraction network. It reduces the parameters and computation of the model and efficiently extracts both the global information and the local features of the crack image. Then, an improved atrous spatial pyramid pooling decoder is proposed to extract and fuse features at different scales and to produce a pixel-level probability distribution. Because crack images suffer from missing detail, an efficient channel attention module is introduced to strengthen the extraction of crack features. In addition, an online hard example mining loss function is designed to mitigate the imbalance between the crack and background classes. The experimental results show that, on a single 3050Ti GPU, CrackViT achieves 75.62% IoU on the crack dataset at 63 FPS with only 2.43 M parameters. The CrackViT-L model achieves 76.83% IoU with 3.56 M parameters at an inference speed of 61 FPS. The tested accuracy is better than that of most mainstream models, with fewer parameters. The results show that the edges of the tunnel crack segmentation images predicted by CrackViT are clearer and more complete and that cracks are detected effectively while inference speed is maintained, which makes the algorithm useful for practical tunnel crack detection.
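The abstract names an online hard example mining loss as the remedy for crack/background imbalance but does not give its formulation. The sketch below shows one common generic variant, assuming binary cross-entropy per pixel: only pixels whose loss exceeds a threshold are averaged, with a fallback that keeps the hardest `min_kept` pixels. The function name `ohem_bce_loss`, the threshold of -log(0.7), and the sample values are illustrative assumptions, not the authors' settings.

```python
import math


def ohem_bce_loss(probs, labels, loss_threshold=-math.log(0.7), min_kept=2):
    """Mean binary cross-entropy over the hard pixels only.

    probs  : flat list of predicted crack probabilities, each in (0, 1)
    labels : flat list of ground-truth labels (1 = crack, 0 = background)
    """
    eps = 1e-7
    losses = []
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        losses.append(-math.log(p) if y == 1 else -math.log(1.0 - p))
    if not losses:
        return 0.0
    ranked = sorted(losses, reverse=True)          # hardest pixels first
    hard = [l for l in ranked if l > loss_threshold]
    if len(hard) < min_kept:                       # too few hard pixels:
        hard = ranked[:min_kept]                   # keep the hardest ones
    return sum(hard) / len(hard)


# Hypothetical pixels: four background, two crack.
probs = [0.05, 0.10, 0.02, 0.40, 0.90, 0.20]
labels = [0, 0, 0, 0, 1, 1]
ohem = ohem_bce_loss(probs, labels)

# Plain mean over all pixels, for comparison.
plain = sum(
    -math.log(p) if y == 1 else -math.log(1.0 - p)
    for p, y in zip(probs, labels)
) / len(probs)
```

In this toy case the many easy background pixels pull the plain mean down, while the mined loss is driven by the one confused background pixel and the missed crack pixel, which is the effect the imbalance remedy relies on.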

English keywords:

crack segmentation; Transformer; MobileViT; atrous spatial pyramid pooling; lightweight model

Authors:

邝先验、徐姚明、雷卉、程福军、桓湘澜

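Affiliation:

School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China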

Keywords (translated from Chinese):

crack segmentation; Transformer; MobileViT; atrous spatial pyramid pooling; lightweight model

Funding:

National Natural Science Foundation of China; National Natural Science Foundation of China

Grant numbers:

51268017; 72061016

Publication year:

2024

DOI:

10.19713/j.cnki.43-1423/u.T20231768

Journal of Railway Science and Engineering

Central South University; China Railway Society

CSTPCD; Peking University Core Journals; EI

Impact factor: 0.837

ISSN: 1672-7029

Year, volume (issue): 2024, 21(8)