
Improved Model for Table-Line Detection Based on YOLOv8n

In table recognition and reconstruction, split-and-merge methods rebuild an electronic table from detected table lines, so the quality of table-line detection directly determines the quality of the reconstruction. To address the false and missed detections of existing methods, an improved YOLOv8n model for table-line detection is proposed. In the backbone network, the BottleneckCSP module is improved with ideas from Swin Transformer, which captures longer-range contextual information and improves recognition of large-scale table lines. Because table lines are long, thin, and dense, the C2f (CSPLayer_2Conv) module is improved with ideas from dynamic snake convolution, which adaptively adjusts the shape and position of the convolution kernel according to the spatial relationships among features; this captures correlations and local details among features more effectively and strengthens feature modeling. The spatial pyramid pooling layer is modified with the CBAM (convolutional block attention module) attention mechanism, which dynamically adjusts the importance of each channel and spatial position in the feature map and thereby makes the feature map more discriminative. The neck structure is rebuilt with shuffle convolution. Experimental results show that, on the ICDAR_2013 and PubTabNet datasets, the improved YOLOv8n model raises mAP@0.5:0.95, precision, and recall by 0.079, 0.301, and 0.088, respectively, and outperforms the other models in the YOLO series. Combined with the merging method, the improved model can further improve table reconstruction.
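The abstract reports gains in precision, recall, and mAP@0.5:0.95. As a rough illustration of why these metrics are demanding for long, thin boxes such as table lines, the sketch below computes IoU and a greedy precision/recall at several IoU thresholds. It is not the authors' evaluation code: the box coordinates, the function names `iou` and `precision_recall`, and the greedy matching rule are assumptions made for illustration, and it does not compute average precision itself (which also integrates over detection confidence).

```python
# Illustrative sketch only: boxes are hypothetical (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, thr):
    """Greedy one-to-one matching of predictions to ground-truth boxes.

    Assumes `preds` is already sorted by descending confidence.
    """
    matched = set()
    true_positives = 0
    for p in preds:
        best_j, best_iou = -1, thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            value = iou(p, t)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j >= 0:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(truths) if truths else 0.0
    return precision, recall


# The ten IoU thresholds behind the "0.5:0.95" in mAP@0.5:0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Two hypothetical horizontal table lines, each 100 px long and 4 px thick.
truths = [(0, 0, 100, 4), (0, 50, 100, 54)]
# One exact hit, one hit shifted down by 1 px, one spurious detection.
preds = [(0, 0, 100, 4), (0, 51, 100, 55), (0, 80, 100, 84)]

for thr in IOU_THRESHOLDS:
    p, r = precision_recall(preds, truths, thr)
    print(f"IoU>={thr:.2f}  precision={p:.3f}  recall={r:.3f}")
```

In this toy case a 1-pixel vertical shift on a 4-pixel-thick line already lowers the IoU to 0.6, so the shifted detection counts as a hit at the 0.5 threshold but as a miss at 0.65 and above.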

table line; YOLOv8n; attention mechanism; dynamic snake convolution; Transformer; lightweight

韦超、钱春雨、黄启鹏、杜林轩、杨哲


School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China


2025

Computer Engineering and Applications (计算机工程与应用)
North China Institute of Computing Technology (华北计算技术研究所)


PKU Core Journal (北大核心)
Impact factor: 0.683
ISSN: 1002-8331
Year, volume (issue): 2025, 61(2)