Research on an efficient detection model for tunnel lining cracks based on DCNv2 and Transformer Decoder

Sun Jilong ¹, Liu Yong ², Zhou Liwei ², Lu Xin ³, Hou Xiaolong ², Wang Yaqiong ², Wang Zhifeng ²

Author information

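1. Shaanxi Provincial Transportation Engineering Quality Monitoring and Appraisal Station, Xi'an, Shaanxi 710075
2. School of Highway, Chang'an University, Xi'an, Shaanxi 710064
3. School of Materials Science and Engineering, Chang'an University, Xi'an, Shaanxi 710061; Xi'an Highway Research Institute Co., Ltd., Xi'an, Shaanxi 710065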

Abstract

Tunnel lining cracks are random in shape and densely distributed, and their annotation boxes have low resolution; as a result, existing models suffer from low recognition accuracy, slow detection speed, and large parameter counts. To address these problems, the YOLOv8 network framework was improved using Deformable Convolutional Networks version 2 (DCNv2) and an end-to-end Transformer Decoder, yielding a lining crack detection model named DTD-YOLOv8. First, DCNv2 was fused into C2f, the convolutional module of the YOLOv8 backbone, so that the model perceives crack deformation features accurately and quickly. Second, the YOLOv8 detection head was replaced with a Transformer Decoder so that the complete object detection process runs within an end-to-end framework, thereby eliminating the computational cost incurred by the Anchor-free processing mode. Seven detection models (SSD, Faster-RCNN, RT-DETR, YOLOv3, YOLOv5, YOLOv8, and DTD-YOLOv8) were compared on a self-built crack dataset. The results showed that DTD-YOLOv8 achieved an F1 score of 87.05% and an mAP@50 of 89.58%. Compared with the other six models, in the order listed, the F1 score was higher by 14.16%, 7.68%, 1.55%, 41.36%, 8.20%, and 7.40%, and the mAP@50 was higher by 28.84%, 15.47%, 1.33%, 47.65%, 10.14%, and 10.84%. The improved model had only one-third as many parameters as RT-DETR, took 16.01 ms to process a single image, and ran at 65.46 frames per second (FPS), faster than the other models compared. DTD-YOLOv8 can therefore deliver efficient performance for crack detection tasks in operational tunnels.
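The abstract reports F1 and mAP@50, both of which depend on matching predicted boxes to ground-truth boxes at an IoU threshold of 0.5. The abstract does not describe the evaluation protocol, so the following is only a generic sketch of how precision, recall, and F1 at a fixed IoU threshold are commonly computed. The function names `iou` and `f1_at_iou`, the `(x1, y1, x2, y2)` box format, and the greedy matching rule are illustrative assumptions, not the authors' code.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_at_iou(preds, truths, thr=0.5):
    """Greedy one-to-one matching, then precision, recall, and F1.

    Assumption: `preds` is already sorted by descending confidence.
    """
    unmatched = list(truths)
    tp = 0
    for p in preds:
        # Pick the still-unmatched ground-truth box that overlaps most.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)  # each ground-truth box matches once
            tp += 1
    fp = len(preds) - tp   # predictions with no matching ground truth
    fn = len(unmatched)    # ground-truth boxes that were never matched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

With one correct detection, one spurious detection, and one missed crack, this sketch gives precision, recall, and F1 all equal to 0.5. mAP@50 extends the same matching by sweeping the confidence threshold and averaging the area under the precision-recall curve over classes.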

Key words

tunnel engineering/object detection/deformable convolution network v2/Transformer Decoder/lining crack

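Funding

National Key Research and Development Program of China (2021YFB2600404)

Transportation Science and Technology Project of the Shaanxi Provincial Department of Transport (22-09K)

Innovation Capability Support Program of Shaanxi Province (2023-CX-TD-35)

Key Research and Development Program of Shaanxi Province (2023KXJ-159)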

Publication year

2024

Journal of Graphics

China Graphics Society

Indexed in: CSTPCD, CSCD, Peking University Core Journals

Impact factor: 0.73

ISSN: 2095-302X

Number of references: 34
