Digital Ancient Book Document Image Correction Based on 3D Features and Transformer

Abstract: Image correction is a key step in digitizing ancient books and has practical significance for the quality of the resulting digital editions. Ancient books commonly suffer from oxidation-induced curling, pages that stick together or fold, and unusual binding methods, all of which produce complex deformations that are difficult to correct. To address this problem, this paper proposes a correction method for ancient book document images based on deep learning and 3D feature extraction. First, a U-Net-style encoder-decoder extracts 3D features from the document image. Next, a Transformer model performs backward mapping on the resulting 3D feature map. Finally, bilinear interpolation produces the corrected image. To verify the method, experiments were conducted on two self-built test sets. The results show that the method reduces Local Distortion (LD) by 2.61% to 6.58% compared with the DewarpNet model, indicating that it can effectively correct ancient book document images and improve the quality of ancient book digitization.
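
The final step of the pipeline, resampling the distorted image through a backward map with bilinear interpolation, can be illustrated with a minimal NumPy sketch. The abstract gives no implementation details, so everything here is an assumption made for illustration: the function name `unwarp_bilinear`, the map layout (one `(x, y)` source pixel coordinate per output pixel), and the clamping of out-of-range coordinates to the image border. The network that would predict the map is not shown.

```python
import numpy as np


def unwarp_bilinear(image, backward_map):
    """Resample `image` at the source coordinates stored in `backward_map`.

    image:        (H, W) or (H, W, C) array (the distorted document image).
    backward_map: (H_out, W_out, 2) array; backward_map[i, j] = (x, y) is the
                  pixel position in `image` that output pixel (i, j) is read
                  from. This layout is an assumption of this sketch.
    """
    img = np.asarray(image, dtype=np.float64)
    bm = np.asarray(backward_map, dtype=np.float64)
    h, w = img.shape[:2]

    # Clamp sample positions to the image so all four neighbours exist.
    x = np.clip(bm[..., 0], 0, w - 1)
    y = np.clip(bm[..., 1], 0, h - 1)

    # Integer corners of the cell that contains each sample position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets inside the cell act as interpolation weights.
    wx = x - x0
    wy = y - y0
    if img.ndim == 3:
        # Add a trailing axis so the weights broadcast over colour channels.
        wx = wx[..., None]
        wy = wy[..., None]

    # Interpolate along x on the top and bottom rows, then along y.
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy
```

With an identity map (each output pixel pointing at its own coordinates) the function returns the input unchanged, which is a convenient sanity check before plugging in a predicted map.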

Keywords:

ancient book images; document image correction; 3D information extraction; Transformer; encoder-decoder

Authors:

赵微, 牟大中, 李夏童, 屈千林, 曹鹏

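Affiliation:

Beijing Key Laboratory of Signal and Information Processing for High-end Printing Equipment, Beijing Institute of Graphic Communication, Beijing 102600, China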

Publication year:

2024

Journal of Beijing Institute of Graphic Communication (北京印刷学院学报)

Beijing Institute of Graphic Communication

Impact factor: 0.247

ISSN: 1004-8626

Year, volume (issue): 2024, 32(8)