A Document Layout Analysis Algorithm Based on Lightweight Convolutional Neural Networks

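Abstract: Existing document layout analysis methods are complex, have many model parameters, and consume substantial resources, which makes them hard to deploy on low-power mobile terminals. This paper therefore proposes a document layout analysis algorithm based on a lightweight convolutional neural network. First, a lightweight document feature extraction structure is designed that achieves implicit feature reuse through structural reparameterization, improving the efficiency and speed of document feature extraction. Second, the SPD-Conv module is introduced; its space-to-depth operation resizes the feature map and expands the number of channels, which better preserves fine-grained information and addresses blurry images and the difficulty of detecting small layout elements. Finally, a concise feature fusion method is proposed, and model compression is used to balance performance and inference efficiency. Experiments show that the method reaches an mAP@0.5:0.95 of 93.8% on the PubLayNet dataset with only 1.6 million model parameters. This indicates that the algorithm achieves high detection accuracy with fewer parameters and can meet the requirements of high-performance document layout analysis on mobile terminals.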
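
The abstract does not spell out the reparameterized structure, so the following is only a minimal sketch of the general idea behind structural reparameterization, assuming a RepVGG-style block: a 3x3 branch, a 1x1 branch, and an identity branch used during training are folded into one 3x3 kernel for inference. The single-channel setting and the helper names `conv2d_same`, `merge_branches`, and `multi_branch` are illustrative assumptions, not the authors' implementation.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output size == input size)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # positions outside the map count as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1_weight):
    """Training-time form: 3x3 branch + 1x1 branch + identity branch, summed."""
    a = conv2d_same(image, k3)
    b = conv2d_same(image, [[k1_weight]])
    return [[a[i][j] + b[i][j] + image[i][j] for j in range(len(image[0]))]
            for i in range(len(image))]


def merge_branches(k3, k1_weight):
    """Inference-time form: fold the 1x1 and identity branches into the 3x3 centre tap."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1_weight + 1  # +1 is the identity branch expressed as a kernel
    return merged


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k3 = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
# Both forms produce the same output, but the merged one needs a single convolution.
print(multi_branch(image, k3, 3) == conv2d_same(image, merge_branches(k3, 3)))
```

Because convolution is linear, the merged kernel reproduces the multi-branch output exactly; real networks apply the same folding per channel and usually fold batch normalization into the kernel as well.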

English title: A Document Layout Analysis Algorithm Based on Lightweight Convolutional Neural Networks

English abstract: Current document layout analysis methods are often complex, with numerous model parameters and high resource consumption, which makes deployment on low-power mobile devices challenging. To address this issue, this study proposes a document layout analysis algorithm based on lightweight convolutional neural networks. First, a lightweight document feature extraction structure is designed to achieve implicit feature reuse through structural reparameterization, thereby improving the efficiency and speed of document feature extraction. Second, the SPD-Conv module is introduced to resize feature maps and expand channels through space-to-depth operations. This helps preserve fine-grained information and addresses image blurriness and the difficulty of detecting small layout elements. Finally, a concise feature fusion technique is proposed, and model compression is used to balance model performance and inference efficiency. Experimental results show that the proposed method achieves an mAP@0.5:0.95 score of 93.8% on the PubLayNet dataset using only 1.6 million model parameters. The algorithm thus delivers high detection accuracy with a reduced parameter count, meeting the requirements for high-performance document layout analysis on mobile devices.
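
The space-to-depth step mentioned in the abstract can be sketched as follows. This is a minimal illustration on a single-channel map stored as nested lists; the function name `space_to_depth`, the `scale` parameter, and the order of the output channels are assumptions made for the example. SPD-Conv as generally described follows this rearrangement with a non-strided convolution, which is omitted here.

```python
def space_to_depth(feature_map, scale=2):
    """Rearrange an H x W map into scale*scale maps of size (H/scale) x (W/scale).

    No value is discarded: spatial resolution is traded for channel count,
    unlike strided convolution or pooling, which drop or merge positions.
    """
    h, w = len(feature_map), len(feature_map[0])
    if h % scale or w % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):
        for dx in range(scale):
            # Take every `scale`-th row starting at dy and every `scale`-th column starting at dx.
            channels.append([row[dx::scale] for row in feature_map[dy::scale]])
    return channels


feature_map = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
for channel in space_to_depth(feature_map):
    print(channel)
```

With `scale=2`, a 4 x 4 map becomes four 2 x 2 maps, so the element count is unchanged while the channel dimension grows by a factor of four.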

English keywords: document layout analysis; convolutional neural networks; lightweight; structural reparameterization

Authors: Cai Yunbing (蔡云冰), Yang Cihui (杨词慧), Cui Guohao (崔国昊), Chen Siyu (陈思宇)

Affiliation: School of Information Engineering, Nanchang Hangkong University, Nanchang 330063

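Keywords: document layout analysis (文档版面分析); convolutional neural networks (卷积神经网络); lightweight (轻量化); structural reparameterization (结构重参数化)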

Publication year: 2024

DOI: