
Research on a handwritten expression image recognition algorithm based on MD-CycleGAN

When generative adversarial networks are used to generate images, word vectors or character vectors struggle to reconstruct the two-dimensional structure of mathematical expressions. To address this, the task of generating handwritten mathematical expression images is recast as a style transfer problem from printed to handwritten mathematical expressions, and a self-built dataset with handwriting-style classification labels is used to train the style transfer model. To remedy the incomplete content, distorted details, and low quality of images generated by the CycleGAN network, a multi-scale discriminator cycle-consistent generative adversarial network, MD-CycleGAN, is designed. It introduces the CBAM attention mechanism to compensate for the information lost during downsampling, and replaces the ReLU activation function with the ACON activation function, which controls the degree of nonlinearity of each network layer through adaptive learning. The experimental results show that the proposed data augmentation method based on generative adversarial networks effectively reduces the degree of model overfitting. This study provides a new method for the automatic recognition of handwritten mathematical expression images that overcomes the data annotation problem and the model generalization problem, and it has a wide range of potential applications, including mathematics education, scientific document processing, and mathematical search engines.
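The abstract says that ACON replaces ReLU so that each layer can learn how nonlinear it should be. As a rough illustration only, the sketch below implements the scalar ACON-C form, f(x) = (p1 - p2)·x·σ(β·(p1 - p2)·x) + p2·x. This assumes the ACON-C variant; the abstract does not say which ACON variant MD-CycleGAN uses or how its parameters are initialized. The names `acon_c`, `p1`, `p2`, and `beta` are illustrative, and in a real network these would be learnable per-channel parameters rather than fixed floats.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x: float, p1: float = 1.0, p2: float = 0.0, beta: float = 1.0) -> float:
    """ACON-C activation for a single value (assumed variant, see lead-in).

    beta acts as a switch: a large beta approaches max(p1*x, p2*x),
    which is ReLU when p1=1 and p2=0, while beta=0 gives the linear
    mean (p1*x + p2*x) / 2.
    """
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def relu(x: float) -> float:
    """Reference ReLU used for comparison."""
    return max(0.0, x)


if __name__ == "__main__":
    for beta in (0.0, 1.0, 50.0):
        values = [acon_c(x, beta=beta) for x in (-2.0, 0.0, 2.0)]
        print(f"beta={beta}: {values}")
```

With `p1=1` and `p2=0`, increasing `beta` moves the output from a linear response toward ReLU, which is the sense in which a learned `beta` sets the degree of nonlinearity.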

Keywords: MD-CycleGAN; handwritten mathematical expressions; image recognition; neural network

Lü Chuang (吕闯), Shui Qingmei (水卿梅)


Pass College of Chongqing Technology and Business University, Chongqing 401520


National Natural Science Foundation of China

62302475

2024

Laser Journal (激光杂志)
Chongqing Institute of Optics and Mechanics (重庆市光学机械研究所)


Indexed in: CSTPCD; Peking University Core (北大核心)
Impact factor: 0.74
ISSN: 0253-2743
Year, volume (issue): 2024, 45(8)