Character recognition for government affairs based on capsule network and language model

Abstract: Character recognition is an important research topic in computer vision and lays the foundation for building intelligent government services. However, government images vary widely in quality and contain diverse font styles, which lowers recognition accuracy. To address these problems, a CNLM model that combines a capsule network with a language model is proposed, and character segmentation is combined with the capsule network. First, the government image dataset is organized into character recognition images and sentence samples for the language model, and training proceeds in stages. In the first stage, the visual model is pre-trained on a public character segmentation dataset, and the language model is pre-trained on the sentence samples and existing structured data. In the second stage, the visual model and the language model are trained jointly, and their outputs are selected and iterated to obtain the text sequence contained in the image. Tested on the government image dataset and the GA-HWDB dataset, the method improves accuracy over VisionLAN by 2.12% and 2.69%, respectively.
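The abstract does not say how the outputs of the two models are "selected and iterated". The following is only a minimal sketch of one plausible reading, assuming each model returns a (character, confidence) pair per position and that the higher-confidence prediction wins. The names `select`, `recognize`, and `toy_language_model` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple

# One prediction per character position: (character, confidence in [0, 1]).
Prediction = Tuple[str, float]


def select(vision: List[Prediction], language: List[Prediction]) -> List[Prediction]:
    """Keep, per position, whichever prediction has the higher confidence."""
    return [v if v[1] >= l[1] else l for v, l in zip(vision, language)]


def recognize(
    vision_out: List[Prediction],
    language_model: Callable[[str], List[Prediction]],
    max_iters: int = 3,
) -> str:
    """Alternate language-model rescoring and selection until the text is stable."""
    current = list(vision_out)
    for _ in range(max_iters):
        text = "".join(ch for ch, _ in current)
        refined = select(current, language_model(text))
        if refined == current:  # nothing changed, so stop early
            break
        current = refined
    return "".join(ch for ch, _ in current)


def toy_language_model(text: str) -> List[Prediction]:
    """Hypothetical stand-in: corrects one known misreading with high confidence."""
    fixed = text.replace("g0v", "gov")
    # Corrected characters get high confidence; unchanged ones get a neutral score.
    return [(ch, 0.9 if ch != old else 0.5) for ch, old in zip(fixed, text)]


# Assumed visual-model output: the middle character is a low-confidence misread.
vision_out = [("g", 0.95), ("0", 0.40), ("v", 0.92)]
result = recognize(vision_out, toy_language_model)
```

In this toy run the low-confidence "0" is replaced by the language model's "o", while the confident visual predictions are kept. A real system would use learned models and a scoring rule chosen by the authors.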

Keywords:

intelligent government affairs; character recognition; capsule network; language model

Authors:

Yu Longyang (于龙洋), Wang Dejun (王德军), Meng Bo (孟博), Wu Yulong (吴余龙), Hu Zonghua (胡宗华), Duan Wei (段伟)

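Affiliations:

School of Computer Science, South-Central Minzu University, Wuhan 430074

Wuhan Lilong Information Technology Co., Ltd. (武汉力龙信息科技股份有限公司), Wuhan 430015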

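Funding:

Hubei Province Science and Technology Innovation Talent Program (2023DJC094); National Key Research and Development Program of China (2020YFC1522900); Graduate Academic Innovation Fund of South-Central Minzu University (3212023sycxjj168)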

Publication year:

2024

DOI:

10.20056/j.cnki.ZNMDZK.20240314

Journal of South-Central Minzu University (Natural Science Edition)

South-Central Minzu University

Impact factor: 0.536

ISSN: 1672-4321

Year, volume (issue): 2024, 43(3)

Number of references: 15